Mathematics - Multivariable Calculus - Lecture 8

Uploaded by UCBerkeley on 17.11.2009

No, I'm working, not walking out.

I will be here on Thursday, so inasmuch as I sympathize with
the organizers, and I think that they really have a very
valid point, they raise a very valid point, I also feel that
my job is to be here, so I will be here Thursday.
Any other questions?
Perhaps even more important, next Thursday we'll have
our first mid-term exam.
Yeah, everybody's excited.
So I already put an announcement on the bSpace
page. By the way, you should check the bSpace site
because there are all kinds of interesting stuff in there,
and useful information also.
So, homework solutions, something I have already told
you about, but also there will be some information
about the exam.
This week I'm going to post a mock mid-term so that you will
have a chance to practice for the mid-term, and we'll have a
review lecture next Tuesday, a week from today.
And the exam is on Thursday exactly during the class hour
-- one hour, 20 minutes, more precisely.
So, we'll start at 3:40 and finish at 5:00.
Because this room is not big enough, you know, because
everybody would be packed, so I have requested another room, so
we'll have to separate into two groups.
I will post information about this and I will tell you about
this and everything about this on Thursday or next Tuesday.
And now I go back to what we talked about last
week, which is limits.
I just want to say a few words about limits, and then we will
move on to the next subject, which is partial derivatives.
So about the limits -- too many late arrivals today.
I'm going to charge a late arrival tax if this continues.
So let's just quiet a little bit because it will be good
for everybody if we are more focused.
So limits.
I illustrated limits by way of an example, and I looked at
a particular function in two variables, namely x squared
divided by x squared plus y squared near (x, y) equals (0, 0).
And the reason is that at this point the denominator becomes
equal to 0, and so this expression becomes problematic,
it may or may not have a limit.
If we did not have that, if, for example, we looked at this
function near a point (x, y) equals (1, 0) or (1, 1), a point where the
denominator is not equal to 0, that would be an easy
question to handle.
But precisely when the denominator becomes equal to 0,
we have to be careful, and we have to analyze it
more precisely.
And so in this particular case, I explained that this function
doesn't have a limit.
This function does not have a limit at this point.
And I explained this, explained the reason.
The reason is that we could find -- the reason I gave was I
showed two different paths on the xy plane which approach (0, 0).

One was the path when x is equal to 0, so
it's a green arrow.
And the other one was the path when y is equal to 0, and
that's the red arrow.
And we have seen that along the first path, when x is equal to
0, when we look at this function on this path, very
close to 0 but not quite equal to 0,
what we get is 0 divided by 0 plus y squared.
But as I said, we really look at it not at this point
but just near this point.
So near this point y is not 0.
So this is not 0 near the point.

Because it's near the point, we can actually evaluate this, and
we see that for all values of y near this point, this is
actually just plain 0 because you divide 0
by something non-zero.
And so this means that along this path, there is a
limit and it's equal to 0.
On the other hand, if we look at this path, we get x squared
divided by x squared plus 0, and again, we will assume that
x is very close to 0 but not quite 0 yet.
So this is actually a non-zero expression, and therefore, we
can cancel them out -- this again is non-zero, and
this actually gives us 1.
So that means that along this path, again, there exists
a limit, namely 1.
So we have found two paths along which we obtain
different limits.
That means that the function itself does not have a limit,
because to say that the function has a
limit means that along any path approaching the point the limit
exists, and all of them are the same.
In this particular case, clearly they are not the same
because there are at least two paths along which
the limits are different.
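
The two-path argument can be checked numerically. This is only a minimal sketch: the function name `f` and the list of step sizes are chosen here for illustration and are not from the lecture.

```python
def f(x, y):
    """x^2 / (x^2 + y^2), defined away from the origin."""
    return x**2 / (x**2 + y**2)

# Shrinking distances to the origin (illustrative choices).
steps = [10.0 ** (-k) for k in range(1, 6)]

# Path 1: x = 0, approaching the origin along the y-axis.
along_y_axis = [f(0.0, t) for t in steps]

# Path 2: y = 0, approaching the origin along the x-axis.
along_x_axis = [f(t, 0.0) for t in steps]

print(along_y_axis)  # every entry is 0.0
print(along_x_axis)  # every entry is 1.0
```

The values stay at 0 on one path and at 1 on the other no matter how close the points get to the origin, which is exactly why no single limit can exist.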
So how can it be that a function has different limits
along different paths, in other words, what is
the meaning of this?
What is the geometric representation of this?
Well, to explain that we can already see an analog of this
phenomenon for functions in one variable, which I
explained last time.
In the case of one variable, we look at the graph of a function
as a geometric representation, and we can have the following
situation where the graph is discontinuous.

So the graph is discontinuous, for example, like this.
So there's some point, x_0, and if you approach from the
left, you're going to end up at this value, and
if you approach from the right you end up with this other value.
So, that means that the function does not have a
limit at this point, because along different paths you
get different limits.
They are both finite, so in some sense you can
argue that it's a more benign situation than, let's say, the
situation of a hyperbola, where it actually goes to infinity.
Here, by the way, on the right it goes to plus infinity and
the left goes to minus infinity.
So, in some sense it's even worse than just going to
infinity, it actually goes to two different
infinities in some sense.
But first of all, if it goes to infinity along any path,
it's already, we already say it doesn't have a limit.
But even if it's finite, this situation, which is in some
sense more benign, it still doesn't have a limit, it has a
limit from the left, has a limit from the right.
In the case of one variable, there are only two possible
paths which converge to this point.
You can go sort of with different speed, different
velocity, but it doesn't matter, I mean geometrically
it's the same path, either this one or this one, only two.
And already that creates trouble.
In the case of functions in two variables there
are many more paths.
I have drawn two paths here -- here is one, here is another,
but you have many more, right, you also have a path like this.
Finally, a path doesn't have to be a straight line,
it could be a spiral.
So, there are many, many different paths, and that's
the difference between two variables and one variable.
To say that the function has a limit, for a function in two
variables, is a very strong statement: it's a statement
that along any of those paths, you're going to achieve
the same limit.
So, this actually shows that in a way it's much easier to
disprove something here, to show that the function does not
have a limit, than to show that the function does have a limit.
Indeed, to show that it doesn't have a limit, it's sufficient
to just exhibit two different paths along which you have
different limits, and usually it's pretty clear which
ones you should take.
For example, in this case, you just look at
x equal 0, y equals 0.
Sometimes you might need to look at a linear path, like
this one where x is equal to y, again, to convince yourself
that indeed the function doesn't have a limit.
But to prove that actually the function does have a limit is
much more difficult because it wouldn't be enough, for
example, to say that the limit along this path and along
this path is the same.
You would also need to show that the limit along this
path is the same, and along infinitely many other paths.
So to show that it has a limit is more difficult, and this is
not a very efficient way to show it.
It is very efficient -- this way of argument is very
efficient to show that it does not have a limit, because then
it's enough to show just two for which the limits are
different, but to show that it has a limit, it wouldn't be
enough to show that along two paths you get the same answer.
You have to show that the same goes
for all paths.
So, for practical purposes, what you need to know is the
argument showing that it doesn't exist.
You need to know this way of argument.
When the limit doesn't exist you should be able to
demonstrate that there exist two paths along which the
limits are different.

The distance, right, the distance, what really matters
is the distance -- what should matter -- so the question is
what really matters, the shape of the spiral or the shape
of the curve or the angle.
It depends on the situation.
The point is that to have a limit means that as soon as you
get close, say within 1 over 100 of an inch of the origin,
the answer is going to be within some small neighborhood
of the value that you claim is the limit.
It doesn't matter how you approach it, it should
be within that limit.
If it's 1 over 1,000 of an inch it should be even closer, if
it's one over a million should be even closer and so on.
So the notion of a limit should not rely on the way you
approach, it should be about, it should be uniform with
respect to all directions and all points in the neighborhood
of the point, of the point 0, 0.
So, in a sense this argument that I gave you is
kind of misleading.
It's a very nice argument to disprove the existence of a
limit, to show that the limit does not exist.
But it is misleading if you try to think in this way about
the existence of limits.
So, for existence of limits, you have to use a different
kind of argument.
And now I'm not going to require that on the exam, but
I'm going to, just to give you an idea how it works, I'm
going to explain it in the following case.
Suppose you have a function which is just slightly
different from this at first glance, namely instead of x
squared I will take x cubed.

You see, so what happened?
So, the problem here, the problem with this function
was that both numerator and denominator had degree two.
So both numerator and denominator in some sense are
going to 0 at roughly the same speed, but not exactly.
It depends along which direction you go.
If you go along this one, it will be 0, along this one it's 1,
and along this one, for example, where x is equal to y, you can
see it's going to be 1/2.
But because the powers are the same, that's why you end
up with different answers.
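
To see how the answer depends on the direction, here is a small sketch that approaches the origin along straight lines y = m x. The slope parameter `m` and the helper name `value_along_line` are introduced only for this illustration. Along such a line the function x squared over x squared plus y squared equals 1 / (1 + m^2) at every point other than the origin, so the value depends on the line and not on how close you are.

```python
def f(x, y):
    """x^2 / (x^2 + y^2), defined away from the origin."""
    return x**2 / (x**2 + y**2)

def value_along_line(m, t=1e-6):
    """Value of f at the point (t, m*t), which lies on the line y = m*x."""
    return f(t, m * t)

# Compare the sampled value with the closed form 1 / (1 + m^2).
for m in [0.0, 1.0, 2.0]:
    print(m, value_along_line(m), 1.0 / (1.0 + m**2))
```

With m = 0 this is the x-axis (value 1), and with m = 1 it is the line x = y (value 1/2).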
What happens now is that I choose as the
numerator x to the third power, and the third power goes to 0
much faster than the second power.
So that's why this will dominate and this will kill
this guy, so it will become 0.
Well, it hardly kills it, because they both go to 0; in some
sense it just goes to 0 faster, so it doesn't really kill it,
but it depends on your point of view.
So, how would I show that this actually has a limit.
So I claim, I claim that this function does
have a limit at 0,0.
Namely the limit is equal to 0.
How would I show that?
Well, for that I would actually have to estimate the value of
this function, I would have to estimate the value of this
function when I approach 0.
So I will say let's suppose that the point (x, y) belongs to a small
disk of some radius r, and I purposefully don't want to use
delta and epsilon, not because I don't like the Greek alphabet,
but because I know that people immediately feel violated when
I try to talk about epsilon and delta.
So, if I use a different letter you will feel much more
comfortable, some of you will feel more comfortable.
There is a certain -- just the way it sounds -- actually, one
of the students told me that it reminds him of going to
a dentist's office, epsilon and delta.
So, let's call it r, and let's look at the disk of radius r.
I'm drawing it as a big disk, but actually you should think
of it as very small, I just magnify it.
So, this is D_r, and I want to look at all the points here,
and I want to estimate the value
of this function for all points within this disk.
So the function, as I explained already many times is a rule,
which assigns to each of these points a certain number, which
is the value of the function, right, so that's
the function, f.
And I want to estimate the value of the function.
What I want to show is that the closer I get here, the closer
I'm going to get to the neighborhood of 0
value on this line.
So how can I estimate this?
Well, what does it mean that a point belongs to this disk?
It means that the square root of x squared
plus y squared is less than or equal to r.
That's what it means, because that's the distance, that's how
we measure the distance to 0. To belong to a disk of radius r
means that for your coordinates x and y, the square root of x
squared plus y squared is less than or equal to r.
So that means that x squared plus y squared is less than
or equal to r squared.
But you see both of these are non-negative, so if x squared
plus y squared together is less than or equal to r squared,
this also implies that x squared is less than or equal to r squared,
so the absolute value of x is actually less than or equal to r.
This is actually also clear geometrically because if you
have all points in the disk, you will see that all of them
will have the x coordinate less than r in absolute value, except for the two
points which lie on the intersection of the
circle with the x-axis.
So, I know this as soon as my point is in D_r,
and I also know this.
So now, what I would like to do is I would like to
write the following.
I would like to write x cubed divided by x squared plus
y squared, and I want to measure the absolute value.
What matters is that absolute value.
When I say it's close to 0, it doesn't have to be
positive or negative.
It should be very close to 0 on either side.
So that's why, to estimate, it's better to take the
absolute value rather than just the value.
And so now I want to write it like this.
I want to write x cubed as x times x squared, divided by x squared
plus y squared -- I've done nothing, I just pulled
one x out of this fraction.
And then I want to write it as the absolute value of x times
x squared divided by x squared plus y squared.
So now I want to look at this fraction, and clearly, it is less than
or equal to 1, because in the numerator I have x squared,
but in the denominator I have x
squared plus y squared, and this guy's always
positive or 0.
So the largest value this can attain is 1, when y squared is
0, but if y squared is not 0, the numerator will be strictly
less than the denominator.
So, this fraction is less than or equal to 1, and the absolute value
of x, as I have just explained, is less than or equal to r.
So that means that the whole thing is less
than or equal to r.
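
Written out as a chain of inequalities, the estimate just described reads as follows. This is a sketch in symbols of the spoken argument; the absolute value on x is the one implied by the bound x^2 <= r^2.

```latex
\left| \frac{x^3}{x^2 + y^2} \right|
  = |x| \cdot \frac{x^2}{x^2 + y^2}
  \le |x| \cdot 1
  \le r
\qquad \text{for } (x, y) \ne (0, 0) \text{ with } \sqrt{x^2 + y^2} \le r .
```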
So you see I have been able to effectively estimate the
value of this function.
I can not find it for all x and y -- I mean they're
all different for different x and y.
But I can say for sure that as long as x and y, as long as the
point xy is within that disk of radius r, the value is going to
be less than r -- less than or equal to, it doesn't matter.
So that means that if I make this disk smaller and smaller,
in other words, as I take r closer and closer to 0, this
value of the function -- more precisely the absolute value of
the function, will also tend to 0, because for all points
within that disk, all points, this value is going to be less
than or equal to r, and I have control over r because I
can take my points in a smaller and smaller disk.
Nowhere in this argument am I talking about particular paths
approaching 0, I'm talking about the entire disk of radius
r, and then I'm squeezing that disk, I'm taking smaller and
smaller and smaller, and by being able to use this estimate
I can control the value as a function of the radius.
I can say that if I am within the radius 1 inch, the value
of the function is going to be less than 1.
If I'm within the radius one over 100 of an inch, it's
going to be less than one over 100; one over a million, less than one over a million.
In other words, I can get as close as I want to 0 for
the value by choosing a sufficiently small disk.
Now that's the argument, that's the proof that this function
has a limit and the limit is equal to 0.
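
The disk estimate can also be spot-checked numerically. The sketch below samples points on a polar grid inside the disk of radius r and records the largest absolute value of x cubed over x squared plus y squared that it finds. The names `g` and `largest_abs_value_in_disk` and the grid sizes are assumptions made for this illustration. Sampling is not a proof; it only confirms the bound at the points tried.

```python
import math

def g(x, y):
    """x^3 / (x^2 + y^2), defined away from the origin."""
    return x**3 / (x**2 + y**2)

def largest_abs_value_in_disk(r, n_angles=360, n_radii=10):
    """Largest |g| over a polar grid of points with 0 < distance <= r."""
    worst = 0.0
    for i in range(n_angles):
        theta = 2.0 * math.pi * i / n_angles
        for j in range(1, n_radii + 1):
            rho = r * j / n_radii  # distance from the origin, never 0
            x, y = rho * math.cos(theta), rho * math.sin(theta)
            worst = max(worst, abs(g(x, y)))
    return worst

# The largest sampled value should not exceed r (up to rounding).
for r in [1.0, 1e-2, 1e-6]:
    print(r, largest_abs_value_in_disk(r))
```

Unlike the two-path check, this looks at all directions at once, which matches the point of the argument: the bound depends only on the radius, not on how the origin is approached.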
And that's what traditionally mathematicians explain in
the epsilon delta language.
But that's all it is, that's all it is is just saying that
if you take a disk of radius r, then the value of the function
for all points within that disk is going to be less than r or
less than something which becomes smaller with r.
So that's the argument.
Are there any questions about this?
So, you ultimately have to prove that the function
will be smaller than r?
That's right.
In general it will not necessarily be r, it could be
-- let's suppose I prove, let's suppose I had a different
function, for example, I had 2 times x cubed, then I would
prove it's less than 2r.
That would still be OK, because I have to get something which
will become smaller and smaller as r becomes smaller.
Or if I get r to the 1/2, the square root of r, I would
also be OK, or r squared.
It will not be OK if I just say that it's less than 1, that the
whole thing is less than 1.
That wouldn't help me.
That would just tell me that I get within a certain range, but
that range doesn't get smaller as the domain gets smaller.
I have to make sure that the range will get smaller as
the domain gets smaller.
And that's a perfect situation where I didn't have to
make any adjustments.
I get r on the nose as the estimate, and the largest
possible value for this.
Another question?

Would that only work -- the question is whether this
is a general argument.
This is not a general case because in general you're
going to have maybe some polynomial in x and y
with additional terms.
I have really looked at the simplest case.
But in general the argument is going to be very similar and if
you like in the book there are more examples of this
type which are analyzed.
But as I said, I'm not going to require you to know this, so my
point here is just to give you an idea of how this
kind of proof works.
And I think that even though this is the simplest example,
it already illustrates this idea.

L'Hopital's Rule.
Well, L'Hopital's rule is really specific to functions
in one variable because you have to differentiate.
So, the question was about L'Hopital's Rule, which was
one of the powerful methods for finding limits of
functions in one variable.
And so the way it works is that if you have a function which is
say p of x divided by q of x and both of these tend to 0,
say, and you don't know what the limit is, you can estimate
the limit by taking the derivatives of p and q.
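
As a reminder of the one-variable rule: if p(x) and q(x) both tend to 0 as x tends to a, and the ratio of derivatives p'(x) / q'(x) has a limit there, then p(x) / q(x) has the same limit. The sketch below illustrates this with p(x) = sin x and q(x) = x near 0. That pair is an example chosen here, not one from the lecture.

```python
import math

def p(x):
    return math.sin(x)

def q(x):
    return x

def p_prime(x):
    return math.cos(x)  # derivative of sin x

def q_prime(x):
    return 1.0  # derivative of x

# Both p and q tend to 0 at x = 0, so p/q is a "0 over 0" expression there.
for x in [1e-1, 1e-3, 1e-5]:
    print(x, p(x) / q(x), p_prime(x) / q_prime(x))
```

Both columns approach 1 as x shrinks, in line with the rule.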

So now the situation is different:
this is the one-variable case, and now we have two variables, and
in two variables we have functions in two variables, say,
p of xy and q of xy.
So let's say I wanted to generalize this, I would have
to take some sort of derivative here, right, and this actually
brings out the question which we study next, which is what
kind of derivatives can we do for functions in two
and three variables.
So clearly, there isn't a single derivative because
derivative is about the rate of change, the rate of change.
In a 1-dimensional case there is only one direction in which
you can change your variables.
You can increase that -- you know you can
just go away from x.
Apart from the fact that you can change it going left and
right, there is essentially only one way to change it.
More precisely, there is only one degree of freedom that you
can change, you can only change in one direction.
But in two variables there are more directions in which you
can change and estimate the rate of change, and therefore,
there are many more derivatives that are possible.
So in fact this is very close to what we discussed up to now.
There are so many different ways to approach 0, and we
have to be able to take care of all of them.
So what L'Hopital's Rule in two dimensions would give us, at
best, is a way to approximate the limit along a
particular direction.
So let's say if I were to take the derivative with respect to
the x direction, then I would be estimating what happens when
I approach along this line.
And if I were to take the derivative with respect to y, I
would be estimating along the green line, along the y-axis.
But neither would give me the full picture.
The full picture I can only get as I have argued, by sort of
looking at paths, right, looking at all possible directions.
So in that sense the L'Hopital's Rule
doesn't help us.
Now, sometimes, sometimes it happens that you can convert
this function that you have in two variables, into a
function in one variable.
One of the exercises, I think it's the last one on the homework,
is about this, where you can use polar coordinates to realize
that the function you have is something like -- it only involves
x squared plus y squared.
I think it's something like logarithm of x squared plus y squared times
x squared plus y squared, something like this.
So you realize that actually it only looks like it's a function
in two variables, but actually, it's a function in one
variable, namely this.
And then you are back to the one-variable case, and then it
becomes fair game to use all the methods that you know
in the one-variable case.
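
Here is a sketch of that reduction. The lecture only recalls the homework function approximately, so the formula used below, (x^2 + y^2) times the logarithm of (x^2 + y^2), is an assumption made for illustration, as are the names `f` and `one_variable_form`. Writing t = x^2 + y^2 turns it into the one-variable function t ln t, which tends to 0 as t tends to 0 from the right.

```python
import math

def f(x, y):
    """Assumed example: (x^2 + y^2) * ln(x^2 + y^2), away from the origin."""
    s = x**2 + y**2
    return s * math.log(s)

def one_variable_form(t):
    """The same function written in terms of t = x^2 + y^2 > 0."""
    return t * math.log(t)

# At a fixed distance r from the origin, f does not depend on the angle.
r = 1e-3
for theta in [0.0, 1.0, 2.5]:
    print(theta, f(r * math.cos(theta), r * math.sin(theta)), one_variable_form(r**2))

# The one-variable form shrinks toward 0 as t does.
for t in [1e-2, 1e-4, 1e-8]:
    print(t, one_variable_form(t))
```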
A general function is not going to be like this.
For example, this one is not like this.
So, I cannot directly use L'Hopital's Rule for two
variables, and there isn't any obvious way to use it because
there is more than one possible derivative, which actually
brings up the question as to what are the possible derivatives
for functions in two variables, and that's our next subject.
So, I have already kind of alluded to the answer.
Because there are two variables, we can actually
differentiate with respect to either one of them and we get a
meaningful derivative, and these are called partial derivatives.

So what are partial derivatives?

So we have a function, let's say we have a function f of xy
in two variables, and when we talk about derivatives, we
should, first of all we should fix the point, we should fix
the point at which we are taking the derivative, because
for a different point you have a different derivative.
The same thing happens for functions in one variable.
So let's say we have a point which has coordinates a and b.
What we can do is we can convert this function into a
function in one variable by freezing one
of the variables.

So, for example, we can say y is equal to b -- freeze the second
variable and say that y, the variable y, is equal to b,
which is the second coordinate of this point.
So what we get then is f of x and b, and let me indicate the
fact that I have frozen this by red, so red
would be a fixed value.
So here also would be, I will put them in red to indicate
that these are numbers like 1 or 5 or 27 over 11,
whatever you want.
But x is a variable, so for x you can plug in any number you want
and you'll get an answer.
So you want to view it still as a function, but because you
have frozen one of the two variables, there is only one
variable that's remaining.
Therefore, what you get is a function in one variable only;
let's call it g of x.

And once we get a function in one variable, we can then
differentiate it just in the usual way how we differentiate
functions in one variable.

Differentiate it, so we'll get a g-prime and then we can
substitute the value that we wanted, the value a.
So then finally we get a number.
So in other words, we have a function in two variables,
first we freeze one of the two variables, and then we take the
derivative with respect to the other variable at the
particular value of that variable, namely a.
So the result of this is what's called the first partial
derivative, or partial derivative with respect to x.
Partial derivative with respect to x at this
point, at the point ab.

And the notation for this is f sub x of ab.

Likewise, I could freeze the second variable -- I mean the
first variable x, I could say x is equal to a.
Then I get a function again in one variable where the first
variable is frozen but the second one is free.
So I get a function of one variable, let's call it h, and
then I can differentiate it.
So what I get is this h-prime of b.
And that's called the partial derivative with respect to y
for which the notation -- I'm just abbreviating the same
sentences I have at the top of this board.
The notation for this is, obviously, f sub y of ab.
So we got two derivatives for functions in two variables,
the first and the second.
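
The "freeze one variable, then differentiate" recipe can be sketched in code. The version below approximates the one-variable derivative with a central difference, which is a numerical stand-in and not the exact derivative described in the lecture. The helper names `partial_x`, `partial_y`, the step size, and the sample function x^2 y are all assumptions for illustration.

```python
def partial_x(f, a, b, step=1e-6):
    """Approximate f_x(a, b): freeze y = b, differentiate in x at x = a."""
    g = lambda x: f(x, b)  # one-variable function with y frozen
    return (g(a + step) - g(a - step)) / (2 * step)

def partial_y(f, a, b, step=1e-6):
    """Approximate f_y(a, b): freeze x = a, differentiate in y at y = b."""
    h = lambda y: f(a, y)  # one-variable function with x frozen
    return (h(b + step) - h(b - step)) / (2 * step)

def sample(x, y):
    """Illustrative function x^2 * y, with f_x = 2xy and f_y = x^2."""
    return x**2 * y

print(partial_x(sample, 3.0, 2.0))  # close to 2 * 3 * 2 = 12
print(partial_y(sample, 3.0, 2.0))  # close to 3^2 = 9
```

The functions `g` and `h` play the same role as the frozen one-variable functions in the lecture: each call to `partial_x` or `partial_y` returns a single number attached to the point (a, b).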
So now let's look at the example of what
this looks like.
Let's say f of xy is x to the 5 plus x times y cubed plus
cosine x times e to the y.
And we would like to find the partial derivatives.
Now, when I defined them, I was insisting that
the derivative has to be evaluated for particular values
of x and y, which I denoted by a and b.
Just like in the case of a function in one variable, let's
say you have a function in one
variable, say f of x is equal to x cubed.

I could say that the f-prime for any value a will be --
well, will be 3a squared, right.
That's the rule, because I know the rule.
The rule is that the derivative of x cubed is three times x
squared, and then if I substitute x equals
a, then I get this.
So usually we don't write it like this.
Usually we just write f-prime of x is x squared.
In other words, we would like to look at not just one value
of the derivative for particular value of x, namely
a, but at all of them.
For all possible values of x, we would like to know what the
value of the derivative is, and then we can substitute x equals
a, for example, 1/2, then you will get -- I'm sorry, I forgot
the 3, 3x squared, and no one corrects me or at least I
didn't hear. 3x squared, of course, for the function, and
we substitute x equals a and we get 3a squared.
But it's too pedantic to go this long way each time and say
well, if I ask what is the derivative of this function,
you say well, for a given value a, the derivative at the point
a is going to be 3a squared.
Instead we just write f-prime of x is 3x squared.
And what is understood is that if I want the value
for a particular a, I'll just plug a into this
formula and I'll get the answer, right.
So we will use the same shorthand for
partial derivatives.
In other words, I will not be writing each time
f sub x of ab, I will just write f sub x of xy.
I will just write f sub x sometimes, and that would be
a function of x and y so that if I substitute a instead of x,
b instead of y, I will get the value of
this particular partial derivative at that point.
So let's see how it works in this case.
In fact, nothing could be easier.
You just look at this function and in order to calculate the
partial derivative with respect to x, you just view y as a
parameter, but not a variable.

This is exactly what I meant when I said that we freeze y.
It just means that we view y as a parameter.
And then you just differentiate what you see.
Well, what do you see, you see x to the 5, so you get 5 x to
the fourth, plus, you differentiate this, it's y cubed,
plus, differentiate this, you get negative sine x
times e to the y.
And that's it.
That's the answer.
That's the way you write the answer.
Now, if you are given some x and y, some values for x and
y, like a and b, you can substitute them and
you'll get a number.
But in fact you can view this first partial derivative
with respect to x as a function of x and y, which I just
obtained in this way.
Likewise, we view x as a parameter, and then take the
derivative with respect to y.
So if x is a parameter, then from this point of view
this is just a constant.
It's independent of y.
Therefore, its derivative is 0.
So it's going to be 0, plus, here x is also a parameter, so we
just differentiate y cubed, so we get 3x y squared.
Of course, cosine x is also constant, and the derivative
of e to the y is e to the y, so the last term is cosine x times e to the y.
So that's the answer for the second partial derivative.
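
The two answers can be sanity-checked numerically. The sketch below codes the function and the two partial derivatives as worked out above, then compares them with central-difference approximations at one sample point. The sample point (1.2, 0.7), the step size, and the variable names are arbitrary choices made for this check.

```python
import math

def f(x, y):
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def f_x(x, y):
    # y treated as a parameter: 5x^4 + y^3 - sin(x) e^y
    return 5 * x**4 + y**3 - math.sin(x) * math.exp(y)

def f_y(x, y):
    # x treated as a parameter: 0 + 3x y^2 + cos(x) e^y
    return 3 * x * y**2 + math.cos(x) * math.exp(y)

a, b, step = 1.2, 0.7, 1e-6

# Central differences with the other variable frozen.
numeric_fx = (f(a + step, b) - f(a - step, b)) / (2 * step)
numeric_fy = (f(a, b + step) - f(a, b - step)) / (2 * step)

print(f_x(a, b), numeric_fx)
print(f_y(a, b), numeric_fy)
```

Each printed pair should agree to several decimal places.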
Is that clear?
So this is really straightforward.
You only need to know how to differentiate functions
in one variable.
Why doesn't cosine go away?
In which one?
In this one.
Well, let's suppose instead of this you had
5 times e to the y.
Then the derivative would still be 5 e to the y, or any other
constant would just show up as an overall factor.
So in this case, the constant is cosine x.
That's what I mean when I say we treat x as a parameter.
If we treat x as a parameter it's treated as a number, and
so any expression involving x, like cosine x, is
a fixed number.
So it just shows up as an extra factor.
Any other questions?
So next I would like to explain the geometric meaning of this,
because as you see in this course, algebra and geometry go
hand-in-hand, and all of the concepts that we discuss
algebraically, they have geometric interpretation,
which is very important.
So, for functions in one variable, the derivative of the
function has to do with the slope of the tangent line.
In one variable, the derivative gives the slope of the
tangent line to the graph.
And the way we draw it is like this.
We have the xy plane, we have a function, f of x, and we have y
equals f of x as the graph.
Note again that y here has a totally different meaning than
y in here. y in here is a second variable, so it's on
equal footing with x -- x and y are two independent variables.
But now I'm talking about functions in one variable, so
there's only one variable x, and y is not a variable, it's
actually -- it denotes the value of the function.
I already talked about this.
It's an unfortunate choice of notation, but that's how it is
so I'm not going to change it.
So, we pick a point, let's say x equals a, and we draw a
tangent line, and we know that, let's say the
angle is theta, the tangent of theta is the
derivative f-prime of a.
That's the geometric meaning of the derivative of the
functions in one variable.
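
A tiny numerical illustration of this relation, using the earlier example f of x equals x cubed. The choice of the point a = 1 and the variable names are assumptions made here.

```python
import math

def f_prime(x):
    return 3 * x**2  # derivative of x^3

a = 1.0
slope = f_prime(a)        # slope of the tangent line at x = a
theta = math.atan(slope)  # angle the tangent line makes with the x-axis

print(slope)
print(math.degrees(theta))
print(math.tan(theta))    # recovers the slope, up to rounding
```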
So then it's natural to ask what is the meaning for
functions in two variables.

To understand that, we have to look at the graph of the
function in two variables.
What does that look like?
Well, for a function in one variable, a graph is a curve on
the plane, and I already talked about it many times why do we
need a plane, because to represent the graph, you have
to have your variables and you have to throw in one additional
variable which will represent the value of the function.
For a function in two variables, there are already two variables
to begin with, and to draw a graph we have to throw in one
more, one more variable, which will represent the
value of the function.
So, as a result, the graph of a function in two variables is
going to live in 3-dimensional space, so it will have
coordinates x, y and z, and we will have a graph of this
function which will be a surface.
So I would like to just draw part of it, which lives
in the first octant.
On the plane the coordinate axes break the plane
into four parts which are called
quadrants, right, because there are four of them.
In space the coordinate planes break the entire 3-dimensional
space into 8 pieces, which are called octants.
So this is one octant, it's looking at us like this.
And so the graph actually lives everywhere, but I have just
drawn the intersection of the graph with each of the
coordinate planes.
And so you should think of this as something like a dome,
like a sphere, like part of a sphere.
It's not necessarily a sphere, I'm just -- just like
this is not a circle.
I mean I'm just doing a sample graph.
And to emphasize this, I want to show a particular
point on this.
So let's say I take this point, and so this
point has coordinates.
To find the coordinates I have to drop a perpendicular onto the xy
plane, so that's going to look like this, and then that's the
z coordinate, maybe a little higher.
So this point is -- what is this point?
So this point has coordinates a and b, and the third coordinate
is a value of the function, because it lives on
this yellow surface.
I don't want to shade it because otherwise it will not
be clear what am I shading, am I shading this, am I shading
the plane and so on.
So just try to imagine that there is something here which
looks like it's part of a sphere, and that's the
point which belongs to it.
And that's the graph of a function, f of xy, so it's
defined by the equation z equals f of xy.
And this is a particular point, let's go with p which has
coordinates ab -- these are given, these are given, these
are just the values of x and y.
And what about the z coordinate?
Well, since it's a graph, the z coordinate has to be f of
the x and y coordinates.
So that means that I have f of ab.

So that's what this point is.
So this is f of ab.

Is it clear so far?
So now what is the slope?
What is the slope of the graph?
Well, first of all, it's not the slope of a graph, it's
the slope of a tangent line.
So here, actually, it doesn't make sense to talk about the
tangent line to the graph, because a line is
1-dimensional, and the graph is 2-dimensional.
How can it not be 2-dimensional if we have a function
in two variables.
A function of one variable has a graph which is a curve,
but a function of two variables has a graph which is a surface,
so it's 2-dimensional.
So it doesn't make sense to talk about a tangent line
unless we make some choices, give some additional information.
So in fact, the proper notion here is a tangent plane, and
this is something we'll talk about on Thursday.
So ultimately what we would really like to
understand is the analog of this picture in two dimensions, and
to get the full analog of this picture, we
should really talk about the tangent plane.
But for now I have a more limited goal.
I want to illustrate the concept of partial derivatives.
And when I talked about partial derivatives, I said that I
freeze one of the two variables, and then I basically
go back to the 1-dimensional case, so the case of a
function of one variable.
So that's what I would like to do.
I don't want to talk immediately about the tangent plane, the
entire tangent plane, I want to see what I get when I freeze
one of the variables.
So in my algebraic calculation on that board,
I first froze the second variable y.
So what happens if I freeze y?
If I freeze y it means that I look at the part of the graph
which has a fixed y coordinate, namely b, so it means that
I cut this graph with the plane, which is y equals b.
So the result is going to look something like this.
So the blue, this blue, is the intersection with
the plane y equals b.
That's what I get.
So now, instead of a surface, I get a curve.
This intersection is actually a curve because now y is frozen,
y is equal to b, and so it's out of the game.
So the game now is between x and z.
And it's the same game as for
functions in one variable.
So, in fact, I can draw this curve as the graph of a
function in one variable x, which I get by
substituting y equals b.
This is, by the way, the function which I called
g of x on that board.
So let me draw this.
So now, as I said, I only have x and z variables remaining,
and this blue curve is going to look like this, and of course,
it continues somewhere, but since I didn't draw it on the
big picture, I'm not going to draw it much beyond
the first quadrant.
It would be tempting to draw it here.
I know you might be wondering why am I drawing it like this
and not like this, but the point is you have to look at it
not from this angle but from the back of the blackboard
where x and z become the oriented coordinate system.
You see what I mean?
You have to turn this, you have to turn this coordinate system
like this, 90 degrees in this way, to make x go to the
right and z go vertically up.
Go up.
If I look like this it would be x will go here,
so I don't want it.
I want to look like this.
And that's what I will see.
If I turn this this is what I will see.
So this is, in fact, the graph, this is a graph of what I
called g of x, which is obtained by taking f of xb.

It is part of the surface which is the graph of the entire
function, but I have frozen one of the variables, so I actually
was able to reduce my problem to the problem of a
function of one variable.
I get the graph of a function of one variable, namely
z equals g of x.
And now I can calculate for the value of x equal to a.
I can calculate the slope of the tangent line.
So let me draw this tangent line in white, so there is this
tangent line and it has a slope, and the tangent of the
slope angle is the derivative g-prime at the point a, which is what we
call the first partial derivative of the function
f at the point ab.
You see what I mean?
I'm doing geometrically here precisely what I did
algebraically on this board.
Algebraically I freeze one of the variables, I get a function
of one variable, it's called g of x, and I differentiate
it at x equal a.
Now I'm doing the same geometrically.
Freezing the second variable means intersecting the graph
with the plane y equals b, then I'm down to two variables.
I can look at it as the graph of a function of one variable, and
then I look at the tangent line to this graph at this point,
and I measure the slope of this tangent line.
The slope is that derivative which we were looking for,
namely the partial derivative with respect to x.
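The freezing procedure described above can be tried numerically. This is only a minimal sketch: the dome-shaped function f (an upper hemisphere of radius 3) and the point (a, b) = (1, 2) are assumptions chosen for illustration, since the picture on the board is just a sample graph.

```python
import math

# Hypothetical dome-shaped surface (upper hemisphere of radius 3).
# The lecture's picture is only a sample graph, so this f is an assumption.
def f(x, y):
    return math.sqrt(9.0 - x * x - y * y)

a, b = 1.0, 2.0  # illustrative point (a, b), not taken from the lecture

# Freeze y = b: the slice of the surface is the graph z = g(x)
# of a function of one variable.
def g(x):
    return f(x, b)

# Slope of the tangent line to z = g(x) at x = a, via a central difference.
h = 1e-6
slope_numeric = (g(a + h) - g(a - h)) / (2 * h)

# Exact partial derivative for this particular f:
# f_x(a, b) = -a / sqrt(9 - a^2 - b^2).
slope_exact = -a / math.sqrt(9.0 - a * a - b * b)

print(slope_numeric, slope_exact)
```

The two printed numbers should agree to several decimal places, which is the statement that the slope of the blue curve at x equal a is the partial derivative with respect to x.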
Any questions?
So let me draw it now on this board.
So the tangent line that I drew over there is going
to look like this.
That's the tangent line.
It's not the entire tangent plane, it's one line
on that tangent plane.
If you think of the tangent plane as this -- it doesn't
want to turn anymore.

I didn't know that it had some knobs and some
things to play with.
If this is the tangent plane, then I have drawn just one
line on it, and that's the intersection with the xy plane.
Maybe it's better to draw it like this.
If you think of the -- not the xy plane, sorry, it's
the plane y equals b.
So think of the y equal b plane as this vertical, kind of
vertical plane, then that's the tangent line that I got.
So this green plane is not yet on the picture.
I have not drawn it yet.
I have only drawn this.
And so now I'm going to draw the second one, so that's my
point, that's what point -- yellow.

And now I will talk about the second tangent line, which
corresponds to freezing the other variable.
So this corresponds to y equals b.
And this corresponds to x equals a.
So, I intersect now with the plane x equals a, and I'm going
to use a different color for this, so it's going to
be something like this.
And now, so this red curve is the intersection with
the plane x equals a.
So x equals a would be like this.
So what's the tangent line to this?
Well, this already looks like a tangent line, but I want to erase this
part so we don't get confused.
The second tangent line is going to look like this, and
that's the second white line which I drew on that board.
So, if you want, the tangent plane looks like this,
this is a tangent plane.
The tangent plane is spanned by both of these lines,
both of the tangent lines.
So the tangent plane is the green board, but I cannot put
it there, so think of the graph as being a kind of a part of a
sphere which is just below this plane, so that this plane just
touches it; it's just the tangent plane
to this graph.
But on this tangent plane I can clearly distinguish two lines.
One of them corresponds to the intersection with the y equal
b plane and the other one with the x equal a plane.
When I intersect with those planes, I get pictures just like this.
This is the first one and here's a second one.
In the second one, I have two variables left, also, but those
variables are y and z because I have fixed x now -- x is equal
to a, but y remains a free variable.
So I'm talking about this red curve, and this red
curve looks like this.
The way I have drawn it, it looks almost identical, the
blue and red, but I just did it to simplify the picture.
Of course, in general they're going be totally different.
And on this curve, I pick the value y equal b -- so, this
is my yellow point, for emphasis -- I forgot to put the yellow in
that place, but I'm sure you understand it.
And then the tangent line is here, and then maybe let's
call this theta prime.
The tangent of the theta prime -- ah, maybe I should say that
this is a graph, z equals h of y, which is f of a,y.

And the tangent is the derivative h-prime of
b, which is fy of ab.
So this line, this tangent line I have drawn here is the
tangent line to the red curve on the graph.
So that's the picture.
So to summarize this, in the case of one variable you only
have one derivative, and that one derivative corresponds to
the slope of the tangent line at a given point.
In two variables you have a tangent plane, and what the partial
derivatives give you is the slopes of two tangent
lines which belong to this plane, namely the lines which
are obtained -- like these two lines -- the two lines which
are obtained by intersecting the tangent plane with
the plane y equal b, or the plane x equal a.
That's the idea.
So we just kind of look at it from two different angles.
We'll look at this tangent plane from two different
angles, and what we get is two different lines, and once
you get lines you can talk about slopes.
You can not talk about the slope of a whole plane --
a plane doesn't have a slope, lines have slopes.
And so, there are sort of two independent slopes that we can
talk about, one with respect to x and one with respect to y,
and they correspond to the two partial derivatives, one with
respect to x and one with respect to y.
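Written out as formulas, the two slopes are ordinary one-variable derivatives of the two slices. This is a summary sketch; the increment symbol t is introduced here only for illustration and does not come from the board.

```latex
\begin{aligned}
g(x) &= f(x, b), & f_x(a, b) &= g'(a) = \lim_{t \to 0} \frac{f(a + t,\, b) - f(a, b)}{t}, \\
h(y) &= f(a, y), & f_y(a, b) &= h'(b) = \lim_{t \to 0} \frac{f(a,\, b + t) - f(a, b)}{t}.
\end{aligned}
```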
Any questions?
Not a good notation, because prime is for derivative
-- let's go with tilde.
I just wanted to distinguish from the other theta, I
wanted to make sure it's not the same as that theta.
But putting a prime is like the worst possible notation because
then it looks like I'm taking the derivative of
theta, which I'm not.
Even better to call it something else, epsilon maybe.
Alpha, that's a good compromise.
It's Greek but not epsilon or delta, which are taboo.
So what else can we do?
In the case of a function in one variable, we could also
take further derivatives.
We don't have to take just one derivative, we can take the
second derivative, third derivative and so on.
So it's natural to ask whether we can do something similar for
functions in two variables.
And the answer is yes.
We can also take, for example, a second derivative.
So, in other words, we start with the function f of x and y,
and then we can take the first derivative with respect to x
and we can take derivative with respect to y.
So we get two new functions; I'll give you an
example of this for a particular case of f.
So both of them are also functions in two variables, so
we can again apply the same procedure and take partial
derivatives of these functions.
Then if we go this way, we obtain fxx of xy.
What do I mean by this?
I mean that I take f sub x, this function, and I take
the derivative with respect to x one more time.
That means again freezing y and then taking derivative
with respect to x.
OK, if I go this way, I get fyy of xy, which means I take fy,
the derivative of f with respect to y, and I take the
derivative with respect to y one more time.
But, of course, I can also do mixed derivatives.
For example, here I can take this and can take
the derivative of this with respect to y.
So that I will denote as fxy of xy.
That means taking first the derivative with respect to
x and then with respect to y.
But I can also do fyx, which is first with respect to
y and then with respect to x.
And then, of course, the natural question is whether
I get the same answer if I apply these derivatives
in a different order.
The question is whether these are actually equal.
And in my example, let's
see if I remember.
I think it was f of xy equal to x to the 5 -- what was it? -- plus x y cubed
plus cosine x e to the y.
So what will it be -- so let me write f of xy, so then the
derivative with respect to x was 5x to the 4 plus y cubed
minus sine x e to the y.
If I do one more derivative I get 20x cubed.
Now y cubed is a constant as a function of x.
I view y as a parameter, so it doesn't depend on x,
therefore, the derivative vanishes, so it disappears.
And then I take one more derivative of minus sine, I get
minus cosine x times e to the y.
On the other hand, I can take the derivative with respect to y,
I get 3x y squared plus cosine x e to the y.
One more derivative -- 6xy plus cosine x e to the y.
And now the most interesting thing, I can -- let me do it
like this so that we don't lose track of where we are.
So first we take the derivative of this with respect to y, this
disappears, this becomes 3y squared minus sine x e to
the y, so that's this way.
And if I go this way -- I'm sorry, not this way, this way.
I get the same, right, I get 3y squared and I take the
derivative with respect to x, so minus sine x e to the y.
So clearly I get the same answer.
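For reference, the board computation can be written out as follows. This assumes the example function is f(x, y) = x^5 + x y^3 + cos(x) e^y, which is the function the spoken derivatives are consistent with.

```latex
\begin{aligned}
f(x, y) &= x^5 + x y^3 + \cos x \, e^y, \\
f_x &= 5x^4 + y^3 - \sin x \, e^y, & f_y &= 3x y^2 + \cos x \, e^y, \\
f_{xx} &= 20x^3 - \cos x \, e^y, & f_{yy} &= 6xy + \cos x \, e^y, \\
f_{xy} &= 3y^2 - \sin x \, e^y, & f_{yx} &= 3y^2 - \sin x \, e^y.
\end{aligned}
```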
So this is actually a general result, which
is called Clairaut's theorem.
Or I guess if I pronounce it as the French do, with a French
accent, it will be Clairaut.
So Clairaut's theorem says that under favorable
conditions, which is essentially the condition that
in a small neighborhood of a given point all partial
derivatives up to the second order exist and are
continuous functions,
the two mixed
derivatives are the same.
So this is, in fact, Clairaut's theorem under some
conditions of continuity,
which will be satisfied
in all our examples.
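A quick numerical sanity check of the equality of mixed derivatives is sketched below. It assumes the lecture's example function is f(x, y) = x^5 + x y^3 + cos(x) e^y; the helper names d_dx and d_dy, the step size and the test point are all hypothetical choices made for illustration.

```python
import math

# Assumed form of the lecture's example function.
def f(x, y):
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def d_dx(func, h=1e-4):
    """Approximate the partial derivative in x: freeze y, central difference in x."""
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)

def d_dy(func, h=1e-4):
    """Approximate the partial derivative in y: freeze x, central difference in y."""
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))  # first with respect to x, then with respect to y
f_yx = d_dx(d_dy(f))  # first with respect to y, then with respect to x

x0, y0 = 0.7, -0.3  # arbitrary test point
exact = 3 * y0**2 - math.sin(x0) * math.exp(y0)  # 3y^2 - sin(x) e^y

print(f_xy(x0, y0), f_yx(x0, y0), exact)
```

Both nested differences sample the same four nearby points, so their agreement with each other is nearly automatic; the informative part is that both match the exact formula 3y squared minus sine x e to the y.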
So this is actually great because what it means is this:
think of the way of taking partial derivatives of a
function of two variables as I explained, where if you go this
way you differentiate with respect to x, and if you go
this way you differentiate with respect to y.
Right, you could do that.
We could continue this picture.
If I go one more step it will be like fxxx, or I could go this
way and it will be fxxy -- always the last one, the
new one is the last one.
And then if I go more, it will be, for example, fyx.
But the point is that it doesn't matter in which order
you take them; what matters is how many times you differentiated in x and
how many times you differentiated in y.
So for instance, fyx is equal to fxy, but likewise, fxyx is
the same as fxxy, the same as fyxx, again, under favorable
conditions when functions in question are continuous
and differentiable.
So all that matters is not the order, but the number of times
you differentiate with respect to x and y, which is kind of
nice, so it has the same commutative structure as the
structure that you have for the variables x and y themselves.
In fact, differentiation is in some sense the process which is
opposite to the process of multiplication by x or y.
So that you have two operations of multiplication by x and y,
but you also have two operations of differentiation
by x and by y.
And multiplication by x and y commute -- two multiplications
commute, and the derivatives also commute.
By the way, there is actually kind of a better notation
for this differentiation.
Because for now the differentiation is denoted by writing
this additional subscript next to the function.
But there is another notation, so I go back to this, this is
our notation, but another notation is ∂f/∂x.

Also, if you wish at ab, but it doesn't have to be.

And likewise, a notation for the second partial derivative, the one with respect to y, is ∂f/∂y.

So this is a particular notation.
This is not to be confused -- this curly ∂ should not be confused with the
straight d, with just a straight letter d.
It's not the same.
In fact, the straight dx actually makes sense, which I will
explain on Thursday.
I will finally explain what dx means.
But ∂x by itself doesn't make any sense; this
makes sense, ∂/∂x.
∂/∂x is a procedure which you can apply to a function,
and it gives you the first partial derivative.

This is an operation which, in the case of one variable,
we just denote by prime.
Also in the case of one variable, when
you have f of x, you write f-prime
of x or you write df/dx.
But now we cannot write like this, as I will explain in
more detail on Thursday.
To differentiate you have to specify in which direction
you differentiate and this is one way to do it.
Say you choose to differentiate with respect to the x direction
and then you get this.
But this is not to say -- the numerator by itself doesn't
make any sense as a notation, and the denominator also
doesn't make any sense.
Only these two things together make sense.
This is the notation for the partial derivative, and likewise, you
have the notation for the partial derivative with respect to y,
which takes any function to its derivative with respect to y.
This is f sub x and this is f sub y.
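The two notations can be put side by side, with the partial derivatives written using the curly ∂. The first-order identities are the ones described here; the second-order forms follow the standard textbook convention and are an addition not written out in this part of the lecture. Note the order: f_xy means first x, then y, which reads right to left in the operator form.

```latex
\begin{aligned}
f_x &= \frac{\partial f}{\partial x}, &
f_y &= \frac{\partial f}{\partial y}, \\
f_{xx} &= \frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}, &
f_{xy} &= \frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \, \partial x}.
\end{aligned}
```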
This, on the other hand, df, is an entirely different
object; it's called
the differential.
And it's not the same; likewise, ∂f is not df.
So this ∂ is not even a letter, if you think about it.
It's not even a letter of any reasonable alphabet.
It's just a mathematical notation for
partial derivative.
So this, of course, begs the question as to what
is the differential.
What is the differential and what on earth does this mean?
Because this is something we've been using quite a lot,
but never really -- I've never really spelled out
what we mean by this.
But actually it has a very precise meaning, differential
and dx and dy, and this is what we're going to discuss next.
In fact, I have about five minutes left, so I'll give you
a little preview of what's coming on Thursday, and it's
really a very important subject, which
unfortunately has been made really, really obscure
by a very unfortunate choice of notation.
The very bad notation makes it very obscure and very
difficult to understand.
So I remember when I was learning this for the first
time it was impossible to understand.
So it took me a long time to figure it out, but I'm happy to
share it with you, I have to tell you, because it's actually
very simple, and we already know everything that we
need to know about this.
So what I want to do is just tell you just a little bit,
just a couple things about it.
And I will, as always, I will start with a function of one
variable, because that's already a very good example where you can
understand what the differential is and what all
this notation means.
So in fact, I shouldn't have erased it because I'm going to
draw it again, but I just wanted to draw it in a slightly
different way, in a more -- the way I usually draw which
is kind of a -- this is the optimistic view of
reality as it goes up.
The other one went down, that's why I erased it.
So, we talked about tangent lines, and we talked about the
importance of tangent lines, and the importance of the tangent
line really is that it gives you a very useful approximation
to a complicated function on a very small scale.
So the differential, the differential really is
the function whose graph the tangent line is.
So the funny thing is that we talk about a function in one
variable, so in this case let's say you have a function f of x.
And this yellow curve represents the graph of this
function, that is, the set of solutions to the equation
y equals f of x.
So for this function we have two objects -- we have the
function named f, and we have the graph which is the yellow
curve, we have two objects.
And then we talk about the tangent line, and the tangent line
of course we understand geometrically very clearly,
we choose a particular point, so let's say
x0, right, but we never talk about the function which
gives us this tangent line as the graph.
Somehow we usually ignore this question.
So the yellow curve is the graph of this function.
But the tangent line is also the graph of a function, a much
simpler function, which is actually a linear function,
and this function is the differential.
This one is also the graph of a function, namely df.
This is what we mean by df.
Let me write it in words.
Namely the differential of f at this point.
That's what the differential is.
The only subtle point is that for the differential we shift
the coordinates -- we choose a new coordinate system where
the origin is at our point, that's all.
In other words, we view now this line as a graph with
respect to a new coordinate system where the origin is not
here -- it's not at some arbitrary point, but actually
at the point which we are analyzing.
And if you write down the function whose graph this
tangent line is, you will have precisely the differential.
That's how it works for functions in one variable.
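In symbols, the shift of coordinates described above looks as follows. This is a sketch of the standard one-variable statement; the names dx and dy for the shifted coordinates anticipate the notation promised for Thursday and are used here as an assumption.

```latex
\begin{aligned}
\text{tangent line at } x_0: \quad & y = f(x_0) + f'(x_0)\,(x - x_0), \\
\text{shifted coordinates:} \quad & dx = x - x_0, \qquad dy = y - f(x_0), \\
\text{differential:} \quad & dy = f'(x_0)\, dx .
\end{aligned}
```

In the new coordinates the tangent line passes through the origin, so it is the graph of the linear function that sends dx to f'(x_0) dx.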
And now it's absolutely clear what will happen for
functions in two variables.
For functions in two variables, we are going to look at the
tangent plane to this graph, which is represented here, or
if you wish, that's the tangent plane which I was
talking about.
And then you will think of this tangent plane also as the graph
of a function, of a linear function, and that linear
function is the differential of the function in two variables
you started with.
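As a preview only, here is a minimal sketch of that linear function for two variables. It assumes the standard formula df = fx dx + fy dy, which the lecture defers to Thursday, and reuses the assumed example f(x, y) = x^5 + x y^3 + cos(x) e^y; the base point (a, b) and the step sizes are hypothetical.

```python
import math

# Assumed form of the lecture's example function and its first partials.
def f(x, y):
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def f_x(x, y):
    return 5 * x**4 + y**3 - math.sin(x) * math.exp(y)

def f_y(x, y):
    return 3 * x * y**2 + math.cos(x) * math.exp(y)

a, b = 1.0, 0.5  # illustrative base point, not from the lecture

def df(dx, dy):
    """Linear function whose graph is the tangent plane, in coordinates
    shifted so that the origin sits at the point (a, b, f(a, b))."""
    return f_x(a, b) * dx + f_y(a, b) * dy

def error(step):
    """Gap between the true change in f and the linear prediction df."""
    actual = f(a + step, b + step) - f(a, b)
    return abs(actual - df(step, step))

errors = [error(s) for s in (1e-1, 1e-2, 1e-3)]
print(errors)
```

Each tenfold reduction of the step should shrink the error by roughly a factor of a hundred, which is what it means for the tangent plane to approximate the graph on a very small scale.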
So that's the short version of this, and I will give you
more details on Thursday.