14. Introduction to the Four-Vector

Uploaded by YaleCourses on 22.09.2008

Professor Ramamurti Shankar: Let me remind you
that everything I did so far in class came from analyzing the
Lorentz transformations. And you guys should be really
on top of those two marvelous equations, because all the stuff
we are doing is a consequence of that.
I won't go over how we derive them, because I've done it more
than once. But I remind you that if you've
got an event that occurs at x,t for one person,
and to a person moving to the right, at velocity u,
the same event will have coordinates x' = (x - ut)/√(1 - u^(2)/c^(2))
and t' = (t - ux/c^(2))/√(1 - u^(2)/c^(2)).
This is it. This is the key.
From this, by taking differences of two events,
you can get similar equations for coordinate differences.
In other words, if two events are separated in
space by Δx for one person and Δx' for
another person and likewise in time,
then you get similar formula for differences.
So, differences are related the same way that coordinates
themselves are. But I will write it anyway,
because I will use it sometimes one way and sometimes the other
way. Even this one you can think of
as a formula for a difference, except that one of the events
is at the origin. More generally,
if you've got two events, not necessarily at the origin,
then their spatial separation and time separation are
connected in this fashion. So, I put this to work.
I got a lot of consequences from that.
You remember that? For example,
I said, take a clock that you are carrying with you.
Or let's take a clock that I'm carrying with me and let's see
how it looks to you. And let's say it goes tick and
it goes tick one more time. So for me, the time difference
between the two ticks will be, let's say, one second.
It's a one-second clock. The space difference is zero
because the clock is not going anywhere for me.
So for you, Δt' = 1 s/√(1 - u^(2)/c^(2)), which is more than
one second. So you will say,
hey, in the time your clock ticked one second, I claim that, say, two
seconds have really passed. Therefore, you will say my
clocks are running slow. And it's easy to see,
from the same equations, that if you exchange the primed
and unprimed ones and simply change u to
-u, nothing changes. The clock that you think is at
rest with respect to you will look slow to me.
Then, I showed you how when you wanted to measure the length of
a stick which is traveling past you,
you get your assistants, who are all lined up on the
x axis, to measure the two ends of the rod
at any particular time they like.
Take the spatial difference. That's the meaning of the
length of a moving rod. You've got to measure both ends
at the same time. So, you make sure the two
measurements are done at the same time, so Δt = 0.
The distance between them is what I call the length of the
rod. You are moving with the rod.
And as long as you're moving with the rod,
the rod has got two points which are separated by the
length L_0. And one thing occurs at one
end, another thing occurs at the other end.
The spatial difference between them will be simply the length
of the rod in your frame of reference.
So, L will be L_0 times--if
you bring the square root to the other side,
L will be L_0√(1 - u^(2)/c^(2)), whereas
Δt' would be some Δτ in the clock's own frame,
divided by the square root.
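
A minimal numerical sketch of the formulas recalled above, added for illustration and not part of the lecture. The function name `lorentz` and the relative speed u = 0.6c are assumptions chosen so that the square root comes out to 0.8.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz(x, t, u, c=C):
    """Return (x', t') for an observer moving to the right at velocity u."""
    root = math.sqrt(1.0 - (u / c) ** 2)
    x_prime = (x - u * t) / root
    t_prime = (t - u * x / c ** 2) / root
    return x_prime, t_prime


u = 0.6 * C  # assumed relative speed, so sqrt(1 - u^2/c^2) is 0.8

# Time dilation: two ticks of my clock, 1 s apart, at the same place (dx = 0).
_, dt_prime = lorentz(0.0, 1.0, u)  # about 1.25 s for you

# Length contraction: a rod of rest length L0 = 1 m moving past at speed u.
L0 = 1.0
L = L0 * math.sqrt(1.0 - (u / C) ** 2)  # about 0.8 m

# Consistency check: the two ends measured at the same time (dt = 0),
# a distance L apart, are a distance L0 apart in the rod's own frame.
dx_prime, _ = lorentz(L, 0.0, u)
```

With these numbers a one-second tick is seen as roughly 1.25 seconds, and a one-meter rod is measured as roughly 0.8 meters.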

Okay. Now, the important result I
showed you was a lack of simultaneity.
It's not absolute. So, I started with an unfortunate
example of twins born in Los Angeles and New York,
and you guys were pointing out that no normal woman could do
that, except the one from "your momma" jokes.
"Your momma" jokes; your momma is so big she can
have a kid in New York. So, I don't want to deal with
such cosmological objects right now.
So, I will say, pick a better example.
Two things happen. Something here,
something there, separated by distance.
People cannot agree that they were simultaneous.
Now, you may say it is very surprising that
people cannot agree on the spatial separation between two events.
For example, New York and Los Angeles;
over 3000 miles apart. We think that whether you move in a
train or move in a plane,
you all have to agree on the separation, but you don't.
That's just a consequence of the new relativity.
In the old days, before I did relativity,
I was taking just the good old x and y
coordinates. And I said you can take a
point, you can assign to it some coordinates x and
y. Then, somebody comes along with
a rotated axis, but no one seems to be troubled
by the fact that the same point is now given a new set of
coordinates by this new person. Or take a pair of points on
this line. For that person in the rotated
system, these events have the same y' coordinate.
Because you see, this is the y' = 0 axis.
It occurred at the same y', but for me,
they're not the same y coordinate, because that has
that y coordinate; that's a different y
coordinate. So, we are fully used to the
fact that having the same x coordinate for two
events is not an absolute statement; it depends on the
frame of reference. Having the same y
coordinate is also not absolute.
But somehow, in the case of time,
we used to think that time differences are absolute and
spatial differences are absolute.
And what we are learning is no, they are just like the x
and the y. Okay.
So, I'm going to do one more thing with the Lorentz
transformation. This is pretty interesting.
Let's take this equation for time difference;
Δt' = (Δt - uΔx/c^(2))/√(1 - u^(2)/c^(2)).
So, let's take two events. First, something happens;
then, something else happens. And Δt is a separation
in time between them, t_2 -
t_1. Let's say t_2 -
t_1 is positive, so t_2,
the second event, occurred after the first event,
according to me. How about according to you?
Well, Δt' doesn't have to have the same sign as
Δt, because you can subtract this number,
uΔx over c^(2); Δx can be whatever you
like. Therefore, you can find that
Δt' could be negative for you.
And you've got to understand, that's a big deal.
I say this happened first and that happened later.
Then, you say, no, it happened the opposite
way. Now, this can lead to serious
logical contradictions, especially if event one is the
cause of event two. This is a standard example both
in special and general relativity that people talk
about; event one is somebody's
grandmother is born. And event two,
that person is born. We say the birth of the kid
takes place long after the birth of the grandmother,
according to me. What if, in some other point of
view, the kid is born, but the grandmother is not yet
born. And somebody goes and
assassinates the grandmother. This is--In fact,
this is called killing the grandmother.
These words are taken right out of textbooks.
Why this kind of violence is inflicted on grandmothers I
don't know, but it's a standard example.
All logical contradictions involving time travel have to
come--have to do with coming back and doing something.
So, here's an example. The grandchild is born.
Grandmother is not yet born, and something is done,
you know, to prevent her from being born or two seconds after
she is born, she is hit by the mafia.
How do you explain the grandchild?
Where did the grandchild come from?
Because the cause has been eliminated.
Or, event 1, I fired a bullet. Event two, somebody is hit.
And you go to frame of reference in which the person
has been hit and I haven't fired the bullet.
So, you come and you finish me off.
So now, we've got this person dying for no apparent reason
because the cause has been eliminated.
That simply cannot happen. Einstein recognized that there
is a limit to how far he can push this community,
and he conceded that if A can be the cause of
B, you better not find an observer for whom they occur in
reverse order. Because if the cause occurs
after the effect, then there is some time left
for somebody to prevent the cause itself from happening and
we have an effect with no perceivable cause.
So, we want to make sure that Δt' cannot be reversed
whenever the first event could be the cause of the second
event. So, you go and ask this
equation: if Δt is positive, when will Δt'
be negative? That is simple algebra;
you want this term to beat that term.
So, that'll happen if uΔx/c^(2), let me write it as
(u/c)(Δx/c),
is greater than Δt.

If that happens, things are going to happen
backwards, right? So, let's modify that a little
bit by putting this c over here, and rewrite it
finally as u/c is bigger than cΔt/Δx.
Let's understand what this
equation means. This equation means that if
there were two events separated in time by Δt and in
space by Δx, I can find an observer,
at a certain speed u, so that u divided by
c, if it exceeds this number,
for that observer, the time order of the events
will be reversed. Yeah?
Student: [inaudible]
Professor Ramamurti Shankar: It's not yet clear
the velocity is greater than c.
It depends on what's happening in the top and bottom.
You have to ask. What is cΔt?
Remember, Δt is a time between two events.
cΔt is what? It's the distance a light pulse
can travel in that time. The denominator is a spatial
separation between those two events.
Event 1 is here, event 2 is there;
cΔt is how far a light signal can travel in the time
separating these two events. If cΔt looks like that,
then there's enough time for a light signal to go from here to
here. In that case,
cΔt would be bigger than Δx,
but then the velocity that you want for your frame, measured in
units of the speed of light, will be bigger than one,
and that's not possible.

The only time you will get a sensible value for the second
frame, namely with u/c < 1, would be if cΔt < Δx;
that means, here's the first
event and here's the second event.
A light signal could have only gone from there to there in the
time between the events. So, it is saying that if two
events are such that there was not enough time for a light
signal to leave the first event and arrive in time for the
second event, then we can actually find an
observer with the sensible velocity for whom the order of
events is reversed. Therefore, what have we learned
from this? We learned from this that it
should not be--if there is enough time for a light signal
to go from one event to the other event,
then we will not play around with the order of events.
But if there is no time even for the light signal to go,
the system in fact allows you to see them in reverse order.
But we have seen that events which are causally connected,
the order cannot be reversed or should not be reversed.
So, what this is saying is, it should be impossible for an
event here to affect a second event by using a signal faster
than light. In other words,
we are going to say that if there wasn't enough time for a
light signal to go from this event to that event,
then this event could not have been the cause of that event.
We will demand that. Therefore, what the theory needs, for
it to make logical sense, is that no signal should travel faster
than the speed of light. Because if you had a signal
that can travel faster than the speed of light,
then the first event could have been the cause of the second
event, because you send the signal and
the signal may blow something up, but there is not enough time
for light to get there in the same time.
But if you allowed such things, then you will find a frame of
reference with a velocity perfectly sensible in which they
occur backwards, and cause and effect will have
been reversed. So, the answer to the whole
thing, in case you're still struggling with this issue,
is the following. What the theory of relativity
demands is that it should be impossible for events to
influence other events with signals traveling faster than
light. Once you accept that,
there is no logical contradiction in the theory,
because the only time events are reversed is when there is
no time for a light signal to go from here to there.
That means there was no way this could have been the cause
of that. For example,
when I fire a bullet, the cause is my firing of the
bullet, the effect is whatever happens to the recipient.
And the whole thing takes place at the speed equal to speed of
the bullet. In that case,
cΔt is definitely bigger than Δx,
because in the time it took the bullet to go from here to there,
the light pulse could have also gone there and beyond.
In that case, you'll never find an observer
at a speed less than that of light for whom the events will
appear backwards. So, in the special theory of
relativity, the way to ensure that everything is
consistent is to demand that no signal travels faster than the
speed of light. No energy, no signal can travel
that fast. But this leads to something
very interesting. Our view of space-time is now
modified, you see? In the old days,
here is space and here is time. And let's say (0,0) is
where I am right now. I am at the origin,
my clock says zero; I'm here right now.
Any dot I pick here is in my future, because t is
positive for the events. They have not happened yet.
Any dot I draw here is an event in my past, because it occurred
at an earlier time. So, in the Newtonian view of
the world, there is something called right now,
this horizontal line. And everything above it is
called later or future. Everything below it is called
past. But in special relativity,
after you've taken into account a relativistic thing,
let's draw here an axis ct.
So, whenever you have another event so that ct >
x, there is enough time for a light signal to go from here
to here. That's the meaning of saying
ct > x. This is the line ct = x.
This is the region ct > x.
This event is in the future of this event according to me,
and also according to anybody else.
In other words, Δt is positive for me,
and if you go back to these equations, since cΔt >
Δx, you will never find anybody who says this event
occurred earlier than this event.
Another way to understand it is if I want to cause that event,
suppose that is some explosion going off there.
I send a signal that goes and makes that happen.
That signal has to travel slower than light to get there.
Therefore, it could have caused the event.
And therefore, there is no screwing around
with the order of that event. That is in my future according
to me and according to all observers.
If you take an event here, so this is called--this whole
region north of this 45 degree line is called the "absolute
future." By "absolute," I mean future of
me not only according to me, but according to all observers.
For every observer, this will occur later than
this. It won't be later by the same
number of seconds, but it will be later.
So, the Δt between this event and this event is positive
for me, will be positive for all people.
They will all agree this occurred after this;
therefore, the effect appeared before the cause.
Sorry, the cause will appear before the effect.
And it is not really necessary that this event be caused by
this event. It's only necessary that it
could have been caused by this event because not everything in
the universe is caused by some other event.
It can just simply happen. We have heard that;
stuff happens. So, stuff happens here with no
obvious cause, as long as this is in the future,
according to me; by this definition,
it's in the future according to all people.
So, this is called the future because you can affect it.
For example, if I heard something terrible
is going to happen at this point, I can send my guys to go
over there and do something. They have to travel at a speed
that is less than the speed of light so they can affect it.
These events are called the "absolute past."
What that means, any event here could have been
the cause of what's happening to me right now because from that
event, a signal can be sent to arrive
where I am right now at a speed less than the speed of light.
This line, which is called the light cone, is a borderline case
where you can communicate from here to here using a light
signal. So, that's also considered a
future. But what about an event here?

Suppose this distance is two seconds times the velocity of light.

That has not happened yet. And suppose I go there and open
my envelope and say something terrible is going to happen at
this point. There is nothing I can do.
There's nothing I can do. In the Newtonian days,
there's something you can do. Tell someone else to really
hurry up and get there and do something.
But now you cannot do that because for that someone to
leave you with the instruction to do something about this
requires that person to travel faster than light,
and that's not allowed. So, even though you know things
can happen here in the future, and you have knowledge of the
fact that someone is planning to do something there,
you cannot get there.

Okay? That's a very important thing
even for people who are going to law school.
And not all of you guys are going into physics.
But suppose you're going to law school.
You've got the DNA defense, right?
My client's DNA doesn't match. Here's another defense.
If your client was accused of doing something here,
and was last seen here, you can argue that my client
was outside the light cone. That's called outside the light
cone defense, and it's absolutely
water-tight. It's better than anything
because if that event is outside the light cone of your client,
the client cannot be held responsible, because your client
will have to send a signal faster than light,
and that law is unbroken.

So, what about the status of an event like this?
The status of this event is, I can actually find other
observers moving at a speed less than light, but for whom this
event occurs before this event. So, order of events can be
reversed. But it will not lead to logical
contradictions because we know the two events are not causally
connected. So, space-time, which
we used to just divide into the upper half plane and lower half plane,
the future and past, with a tiny sliver called the
present, is now divided into three regions:
the absolute future that you can affect, the absolute past
whose events can affect you, and this is--now,
it doesn't have a name. If you want,
you can call it the relative future according to you,
but I'll find somebody to whom that occurs before this one.
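
The condition for reversing the order of two events can be sketched numerically. This is an illustrative addition, not part of the lecture; the function names and the sample separations are assumptions. Both separations are taken to be positive and expressed in the same length units (cΔt and Δx).

```python
import math


def order_reversing_beta(c_dt, dx):
    """Return the value of u/c above which the time order flips, or None.

    Reversal needs u/c > c_dt/dx, and a physical observer needs u/c < 1,
    so reversal is possible only when c_dt < dx (outside the light cone).
    """
    threshold = c_dt / dx
    return threshold if threshold < 1.0 else None


def c_dt_prime(c_dt, dx, beta):
    """c times the time separation seen by an observer moving at beta = u/c."""
    return (c_dt - beta * dx) / math.sqrt(1.0 - beta ** 2)


# Bullet-like case: light could easily cover dx in the time available.
inside = order_reversing_beta(c_dt=10.0, dx=1.0)   # None: no observer reverses it

# Event outside the light cone: 1 light-second of time, 2 light-seconds apart.
outside = order_reversing_beta(c_dt=1.0, dx=2.0)   # 0.5
slow_observer = c_dt_prime(1.0, 2.0, beta=0.3)     # still positive
fast_observer = c_dt_prime(1.0, 2.0, beta=0.8)     # negative: order reversed
```

Any observer with u/c between the returned threshold and one sees the second event first, which is harmless here because no signal could have connected the two events.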

Okay? That's also a consequence of
the Lorentz transformation. So, you can see that the
equations are very deceptively simple.
In fact, they're a lot easier looking than some equations with
angular momentum, right?
But the consequences are really stupendous.
They all follow from that equation.
And of course, if you invent a theory like
this, you've got to make sure that there are no
contradictions, right?
Suppose you find a theory, you're very happy with
everything; velocity of light is coming out
right. But somebody points out to you
that the order of events can be reversed.
And you've got to agree you'll be in a panic because,
how can I reverse the order of events?
And the theory is so beautiful; internally, it says you can
reverse the order of events if they could not have been
causally connected, where not causally connected
means a signal traveling at a speed less than light could not
have gone from the first to the second event.
If a signal could
have gone, the theory never allows them to be reversed.
If a signal could not have gone, it does say you can find
people for whom events occurred in the reverse order.
Alright. So, this is the second thing,
and it's called the light cone for the following reason.
I've only shown you an x coordinate.
If we had a y coordinate coming out of the blackboard,
the surface would look like a cone.
That's why we call it the light cone.
Sitting at a point in four-dimensional space-time,
there's a cone. And all points in that forward
light cone are the events you can affect.
All points in the backward light cone are events that can
come and get you right now. And the rest of the thing
outside the two cones, you cannot do anything about,
even though you know something is going to happen.
And they cannot do anything to you.
You open an envelope, and it says somebody here is
planning to do something to you. You don't have to worry,
because that person cannot get to you in the time available.
Because the fastest signal is the light signal,
and that won't get there. And neither will anything else.
Alright. Now, we're going to do
something which is theoretically or mathematically very pretty.
So, for those of you who like mathematical elegance,
this is certainly the best example I can offer you,
because it is simple and also rather profound.
But it's completely by analogy. The analogy is the following.
We have seen that when we rotate our axes,
as I've shown there, x' is not the
same as x; it is related to x and y by this formula.
And y' is also not the same as y; it is related by this formula.

Nothing is sacred. Coordinates of a point are not
sacred. They are just dependent on who
is looking at them from what orientation.
But even in this world, we noticed that x'^(2) +
y'^(2) is the same as x^(2) + y^(2).
Namely, the length of the line connecting the origin to the point,
or the distance from the origin, is unaffected by rotations.
This is called an invariant. This doesn't depend on who is
looking at it. Everyone will agree on this.
But they won't agree on x and they won't agree on
y, but they will agree on this.
So, it's reasonable to ask in the relativistic case,
where people cannot agree on time coordinate or space
coordinate, maybe the square of the time
plus the square of the space will be the same for two people.
One can ask that question. So, when we look at that,
we'll find out that that's not the case;
t^(2) + x^(2) is not invariant.
But even before you do that, you should shudder at the
prospect of writing something like this.
Right? What's wrong with this?
Student: [inaudible]
Professor Ramamurti Shankar: Good.
Everyone's on top of the units here.
So, you cannot add t^(2) + x^(2).
There is no way this can be anything.
So, we know how we have to fix that.
We've got to have either both coordinates in space-time
measured with lengths or with time.
The standard trick is to do the following.
Let's introduce an object with two components.
I'm going to call it X. The first component of that is
going to be called X_1--is going
to be called X_0.
The second is going to be called X_1.
X_0, I mean, X_1 is
just our familiar x. X_0 is going
to be essentially time, but multiplied by c.
Why do I switch from t and x to zero and one?
It is just that if you want to go back to more coordinates,
if you want to bring back y and z,
then I just mention for future use that in four dimensions you
really have X_0,
X_1, X_2 and
X_3. That is a shorthand for
X_0 and what you and I used to call the
position r of a particle. It's just that in one
dimension, the vector r becomes a number x,
and I want to call the number X_1 because
it's the first of the three coordinates.
And if you're doing super strings, you can have ten
coordinates, and one will be X_0 and the
nine will be spatial coordinates.
So, we like the numerical index rather than letters in the
alphabet because we run out of letters.
But you don't run out of these subscripts.
So, I'm writing it purposely in many ways, because if you ever
go read something or you're involved in a lab project or the
group is studying something and you want to read the paper,
people refer to coordinates in space-time in many ways.
Some will call it--Some people like to write the X the
following way; X_1,
X_2, X_3,
X_4, but X_4 is
ct. That's why you call that the
fourth dimension. It's still the fourth
dimension, but you can either put it at the end of that first
family of three or at the beginning.
So, it's very common for people to use either notation.
I'm just going to call it X_0.
X_0 is just time, okay, multiplied by
c so that it has units of length and multiplying by
c doesn't do anything. Everybody agrees on what
c is. So, we're all multiplying our
time coordinates by c. Alright.
So, let's ask the following question.
What does the Lorentz transformation look like when I
write it in terms of these numbers?
So, first take x' = (x - ut) divided by the
famous square root. But t is not what I want;
ct is what I want. So, I fudge it as follows;
I write it as (x - (u/c)ct)/√(1 -
u^(2)/c^(2)). This I'm going to write as follows:
x is now called X_1, ct is called X_0,
and I'm going to introduce a new symbol β here,
so it becomes (X_1 - βX_0)/√(1 -
β^(2)). β is the universal
convention for u/c. If all the velocities are
measured as a fraction of the velocity of light,
then β is a number between zero and one.

So, the final way I want to write the Lorentz transformation
for coordinates is that X_1' =
(X_1 - βX_0)/√(1 - β^(2)).
So, you have to get used to this β.
I mean, you're following nicely what I was saying in terms of
velocity u, but you have to get used to
this β. We won't use it a whole lot,
but I put it here so you can recognize it when people refer
to it in the literature, when you go read something.
What's the transformation law for time?
Remember t' = (t - ux/c^(2))/√(1 -
β^(2)). β^(2) is clearly what's
sitting downstairs. u/c is β.
So, what do you think we should do?
We should multiply both sides by c.
Because if you multiply both sides by c,
put a c there, put a c there and get
rid of this one, then you find
X_0' is (X_0 -
βX_1)/√(1 - β^(2)).
So, let me write that here. Now, you see the relationship
with the coordinates is nice and symmetric.
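
The symmetry is easiest to see when the two equations are collected into one matrix equation. This is only a compact restatement of the two formulas above; the shorthand γ is introduced here for convenience and is not a symbol used in the lecture.

```latex
\gamma \equiv \frac{1}{\sqrt{1-\beta^{2}}}, \qquad \beta = \frac{u}{c}, \qquad
\begin{pmatrix} X_0' \\ X_1' \end{pmatrix}
= \gamma
\begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix}
\begin{pmatrix} X_0 \\ X_1 \end{pmatrix}
```

The matrix is unchanged when X_0 and X_1 are swapped, which is the symmetry referred to above.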

If you write in terms of x and t,
the coordinates transformation law is not quite symmetric.
And that's because one has units of length and one has
units of time. But if you rescale your time to
turn it into a length, then the transformation laws
are very simple and symmetric between the two coordinates of
space-time. There are also the other
coordinates; the y direction,
X_2' = X_2 and in the
z direction, X_3' =
X_3. In other words,
the lengths perpendicular to the motion are something you can
always agree on. One can talk and talk about why
that is true. Basically, what we are saying
is, if you are zooming along, and you drew a line that says
y = 1 and originally we agreed on it,
we will have to agree on it even when I start moving
relative to you, because you've got your own
line. And there's no reason your line
should be above my line or
below my line. If the lines originally agreed,
there is no reason why my line is above yours or your line is
below mine, because by symmetry,
there's no reason why one should be higher than the other.
And the reason this is required is that the two lines,
which last forever, can be compared anywhere you
like. Unlike the rods,
which are running away from each other and cannot be
compared, the line here which says y
= 3 has got to be y = 3 for both of us.
So, the transverse coordinates are not modified.

So, that's why I don't talk about them too much.
All the interesting action is between any one coordinate along
which the motion is taking place and time.
So now, let's ask ourselves maybe X_0'^(2) +
X_1'^(2) is the same as this square plus that
square. Well, let me just tell you that
I'm not going to try something that I know will not work.
In other words, you might think
X_0'^(2) + X_1'^(2) = X_0^(2) + X_1^(2).
Well, that just doesn't work. That doesn't work because
1/√(1 - β^(2)) is not a cos θ, and β/√(1 - β^(2)) is not
a sin θ. You cannot make them cosine and
sine of something. If you could, that'll work.
But it doesn't work. But I'll tell you what does
work. Let's take
X_0'^(2) - X_1'^(2),
the square of the time coordinate minus the square of
the space coordinate with a minus.
So, let's try to do this in our head by squaring this and
subtracting it from the square of this one.
Downstairs, I think we all agree you get 1 - β^(2)
because the square root is squared.
How about upstairs? First, you've got to square
this guy, this guy will give you X_0^(2) +
β^(2)X_1^(2) - 2βX_0X_1.
From that, I should subtract
the square of X_1'.
That'll give me -X_1^(2) -
β^(2)X_0^(2) + 2βX_0X_1.

Just simple algebra. And these cross terms cancel
out. And you notice I got an
X_0^(2)(1 - β^(2)) -X_1^(2)(1
- β^(2)), right?
This algebra, I trust you guys can do at
home. And the result of this is going
to be X_0^(2) - X_1^(2).
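
For readers who want the algebra written out in one place, the steps described above are as follows. Only the transformation laws for X_0' and X_1' already given are used.

```latex
\begin{aligned}
X_0'^{2} - X_1'^{2}
&= \frac{(X_0 - \beta X_1)^{2} - (X_1 - \beta X_0)^{2}}{1-\beta^{2}} \\
&= \frac{X_0^{2} + \beta^{2}X_1^{2} - 2\beta X_0 X_1
        - X_1^{2} - \beta^{2}X_0^{2} + 2\beta X_0 X_1}{1-\beta^{2}} \\
&= \frac{X_0^{2}(1-\beta^{2}) - X_1^{2}(1-\beta^{2})}{1-\beta^{2}} \\
&= X_0^{2} - X_1^{2}.
\end{aligned}
```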

That's very nice. It says that even though people
cannot agree on the time or space coordinate of an event,
just like saying people cannot agree on what is x and
what is y, they can agree on this
quadratic function you form out of the coordinates.
But it's very different from ordinary rotations,
where you take the sum of the squares;
here you've got to take the difference of the squares.
And that's just the way it is. Even though time is like
another coordinate that mixes with space, it's not quite the
same. In other words,
if I brought back all the transverse coordinates,
then you will find X_0'^(2) -
X_1'^(2) - X_2'^(2) -
X_3'^(2) = X_0^(2) -
X_1^(2) - X_2^(2) -
X_3^(2). This you can do because
X_2' and X_3' are individually equal to
X_2 and X_3. So, you can put them on both
sides. This is the actual
four-dimensional result for people who really want to leave
the x axis and move freely in space,
which you're allowed to. This is the invariant.
This is the object that everyone agrees on.
Allow me once again to drop this.
Most of the time, I'm not going to worry about
transverse coordinates. So, this is the analogue of the
length square of a vector, and it's called the space-time
interval. It's not the space interval;
it's not the time interval. It's a space-time interval
between the origin and the point (x,t).

And this is denoted by the symbol--there's no universal
convention, but let's use s^(2) for the space-time interval.

Notice that even though it's called s^(2),
s^(2) need not be always positive definite.
You can take two events, and if the time coordinate is
the same, say the time coordinate is zero, and
the space coordinate is not zero, then s^(2) will be
negative. Whereas the usual Pythagoras
length square, x^(2) + y^(2) is
positive definite, always positive,
the space-time interval can be positive or negative or zero.
It's positive if X_0^(2) can beat
the sum of the squares of those guys.
It's negative if they can beat X_0^(2), and it's
equal to zero if X_0^(2) is equal
to that sum.

So, if you want, if you go back to this diagram
I drew here, this is X_1.
This is X_0 = ct.

These are events for which s^(2) is positive.
These are also events for which s^(2) is positive.
These are events for which s^(2) is negative.
These are events in which s^(2) is zero.

So, whenever s^(2) is positive, we have a name for
that. It's called a time-like
separation. Well, it's called time-like
separation because the time-part of the separation squared is
able to beat the space-part of the separation.
It's got more time component than space component.
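
The sign test on s^(2) can be sketched in a few lines. This is an illustrative addition with hypothetical function names. The labels "space-like" and "light-like" are the standard names for the negative and zero cases; only "time-like" has been named in the lecture so far.

```python
def interval_squared(x0, x1, x2=0.0, x3=0.0):
    """s^2 = X_0^2 - X_1^2 - X_2^2 - X_3^2, with X_0 = ct in length units."""
    return x0 ** 2 - x1 ** 2 - x2 ** 2 - x3 ** 2


def classify(s2):
    """Name the separation according to the sign of s^2."""
    if s2 > 0:
        return "time-like"    # the time part beats the space part
    if s2 < 0:
        return "space-like"   # the space part beats the time part
    return "light-like"       # exactly on the light cone


inside_cone = classify(interval_squared(2.0, 1.0))
outside_cone = classify(interval_squared(1.0, 2.0))
on_cone = classify(interval_squared(1.0, 1.0))
```

With measured, non-integer coordinates an exact comparison with zero is fragile, so a small tolerance would be the safer design in practice.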

Now, you can also do the following.
Let me introduce some notation now.
Let us agree that we will use X for a space-time vector,
what's called a four-vector. A four-vector has got four
components. And the components are
X_0, and since I'm tired of writing
X_1, X_2,
X_3, I will call those r.

So, in the special theory of relativity, what you do is,
you take the three components of space and add one more of
time and form a vector with four components.
And we will agree that the symbol for a space-time vector
is an X. Now, I've been careless in my
writing. Sometimes I use x and
X, but you guys should be a little more careful.
X like this, but no subscript,
will stand for these four numbers.
It is a four-vector. It's a four-vector;
it's a vector living in space-time.
And when you've got another frame of reference,
the components of four-vector will mix with each other.
Then, we're going to define a dot product.

It's going to be a funny dot product.
It's going to be called X∙X.
Usual dot product was the sum of the squares of the components.
But this funny dot product will be equal to
X_0^(2) minus the length of the spatial part,
squared.
Student: [inaudible]
Professor Ramamurti Shankar: Pardon me?
Student: Is that a small x?
Professor Ramamurti Shankar: This is an X.
These are all small. X is the name for a
vector with four components. It's a position vector in
space-time. These are--This is the same as
writing X_0^(2) - X_1^(2) -
X_2^(2) - X_3^(2).

Suppose there are two events. One occurs at the space-time
coordinate X. The second event occurs at a
new point, let me call it X_bar,
whose time coordinate X_0bar,
whose position coordinate, this is unfortunate,
vector r with a bar on it.

These are like two vectors in the x,y plane.
You and I, if we are going relative to each other,
will not agree on the components of this or the
components of that. But we will agree on
X∙X_bar. In other words,
X∙X_bar will be the same as X'∙X_bar'.

What I'm saying is, if we go to a moving frame of
reference, the components of the vector X go into
X', whose components are
X_0' and some r'.
And likewise, the vector
X_bar has new components;
X_bar', which are
X_0bar' and r_bar'.
Look, it's the same story as
in two dimensions. Components of vectors are
arbitrary. They vary with the frame of
reference. But the dot product of vectors
is the same no matter how you rotate your axes,
because the dot product is length of A times length
of B times cosine of the angle between them.
None of the three things changes when you rotate your
axes. Well, this is the analogue of
the dot product in space-time because this is the guy that's
same for everybody. You can say,
"Why don't you study this combination?"
It's so nice. All pluses.
I like that. I don't study that because that
combination is not special. If I have one value for the
combination, you will have a different value for the
combination. There's nothing privileged.
It's the one with the minus signs in it that has the
privileged role of playing the role of the dot product.
So, space-time is not Euclidean. Euclidean space is the space in
which we live, and distance squared is the sum
of the squares of all the coordinates.
So, it's a pseudo-Euclidean space in which to find the
invariant length, you've got to square some
components and subtract from them the squares of some other components.
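
As a quick numerical check of the claim that the dot product with the minus sign is the same for everybody while the all-plus combination is not, here is a minimal Python sketch. It keeps only one space dimension and works in units where c = 1. The function names, the two sample events and the relative speed u = 0.6c are illustrative assumptions, not values from the lecture.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption made for convenience


def boost(X, u):
    """Lorentz-transform components (X0, X1) = (ct, x) to a frame moving at velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    X0, X1 = X
    return (gamma * (X0 - (u / C) * X1), gamma * (X1 - (u / C) * X0))


def minkowski_dot(X, Y):
    """The 'funny' dot product: product of time parts minus product of space parts."""
    return X[0] * Y[0] - X[1] * Y[1]


def euclidean_dot(X, Y):
    """The all-plus combination, kept only for comparison."""
    return X[0] * Y[0] + X[1] * Y[1]


# Two hypothetical events and a hypothetical relative speed between the frames.
X = (5.0, 3.0)
X_bar = (2.0, -1.0)
u = 0.6 * C

Xp, X_bar_p = boost(X, u), boost(X_bar, u)

print("minus-sign product:", minkowski_dot(X, X_bar), minkowski_dot(Xp, X_bar_p))
print("all-plus product:  ", euclidean_dot(X, X_bar), euclidean_dot(Xp, X_bar_p))
```

With these sample numbers the minus-sign product comes out as about 13 in both frames, while the all-plus combination changes from about 7 to about 13, so the two observers would not agree on it.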

Okay. So, finally,
if you take a difference vector, in other words,
take the difference of two events,
then the same combination, built from the differences
in the coordinates according to me,
namely, (cΔt)^(2) - (Δx)^(2), will be (cΔt')^(2) -
(Δx')^(2). Or if you like,
X_0^(2) - X_1^(2) will be
X_0'^(2) - X_1'^(2).
I'm using the notations back and forth so you get used to
writing the space-time coordinates in two different
ways. So, it not only works for
coordinates, but for coordinate differences.
You understand? Two events occur,
one here now and one there later;
let's say they are separated in space by
two meters and in time by 11
seconds. I'm sorry.
I didn't mean to write this. I meant to write
ΔX_0^(2) - ΔX_1^(2) =
ΔX_0'^(2) - ΔX_1'^(2).

So, do you guys follow the meaning of this statement?
Most things are relative. Distance between events --
relative. Time interval between events --
relative. But this combination is somehow
invariant. Everybody agrees on what it is.

Now, we are going to understand the space-time interval.

Δs^(2) is the name for a small space-time interval
formed under differences. Now, we are going to apply this
to the following problem. I'm going to give you a feeling
for what space-time interval means when applied to the study
of a single particle. So previously,
Δx and Δt were separations between two random,
unrelated events or arbitrary events.
But now I want you to consider the following event.
Here is a particle in space-time.
And it moves there. This is where it is in the
beginning, that's where it is at the end.
So now, I want Δx to be the distance the particle traveled.

And I want Δt to be the time in which it did this.

So, these are two events in the life of a particle.
Two events lying on the trajectory of a particle.
In the x,t plane, you draw a line,
it's a trajectory, because at every time,
there's an x, and that describes particle
motion. And this is the Δx and
this is the Δt. Let's look at the space-time
interval between the two events. Δs^(2) will be
(cΔt)^(2) - (Δx)^(2).

But we're going to rewrite this as follows.
Equal to (cΔt)^(2)[1 - (v^(2)/c^(2))] where
v is Δx/Δt.

Because these are two events in the life of a particle,
the distance over time is actually the velocity of the particle.

I have always used v for the velocity of a particle that
I'm looking at, and u for the velocity
difference between my frame of reference and your frame of
reference. So, u is always the
speed between frames. And v is the speed of
some particle I'm looking at. So, let's take the square root
of both sides, and we find Δs =
cΔt√[1 - (v^(2)/c^(2))].

So, the space-time interval between two events in the life
of a particle is c times the time difference times this
factor. Now, this is supposed to be an
invariant. By that I mean no matter who
calculates this space-time interval, that person is going
to get the same answer. So, let's calculate the
space-time interval as computed by the particle itself.
What does the particle think happens?
Well, the particle says, in other words,
you're riding with the particle.
So, in your frame of reference, for the particle,
now, the time difference between the two events--let me
call this one Δτ. And the space difference?

What's the space difference between the two events?
Yeah? Student:
[inaudible] Professor Ramamurti
Shankar: Zero. Because as far as the particle
is concerned, I'm still here.
If I'm moving, from my vantage point,
my x coordinate is wherever I chose to put it in the
beginning, and that's where it stays; it follows me.
So the two events, particle sighted here and
particle sighted there, have different x
coordinates for the person to whom the particle is moving.
But if you're co-moving with the particle,
then your x coordinate as seen by you does not change.
Therefore, Δx is zero. Therefore, the space-time
interval would be simply cΔτ where τ is the
time measured by a clock going with the particle.
So, Δτ is the time in particle's frame.
In other words, if the particle had its own
clock, that's the time it will say has elapsed between the two
points in its trajectory. So, it's not so hard to
understand. See, I am the particle.
I'm moving, looking at my watch, and I'm saying that time
will pass for me, right?
One second and two seconds; that's the time according to me.
If you guys see me, you will disagree with me on
how far I moved and how much time it took.
That's your Δx and your Δt.
But for me, there's only Δτ.
There's no Δx for me. So, the space-time interval,
when you describe particle behavior, is essentially the
time elapsed according to the particle.
So, it's not hard to understand why everybody agrees on that.
See, you and I don't have to agree on how much time elapsed
between when the particle was here and when it was there.
But if we ask, how much time elapsed according
to the particle, we are all asking the same
question. That's why we all get the same
answer. The space-time interval, apart from the factor of c, is
called the proper time. So, proper time is another name
for the time as measured by a clock carried with the particle.
So, let's write it this way. The space-time interval is
really cΔτ. And Δτ is also an
invariant. Namely, everyone agrees on it.

And what's the relation between Δt and Δτ?
Δt is the time according to any old person.
And Δτ is the time according to the clock.
And you saw that the space-time interval, cΔτ, was equal
to cΔt√[1 - (v^(2)/c^(2))].
So, we are going to--I'm sorry, the c cancels on both
sides. This is the relation.

So, I will be using the fact that dτ/dt = √[1 -
(v^(2)/c^(2))] or dt/dτ = 1/√[1 -
(v^(2)/c^(2))]. In other words,
the time elapsed between two events in the life of a
particle, as seen by an observer for whom
it has a velocity, compared to as seen by a clock
moving with the particle, that ratio is given by this.
As the particle speeds up, let's see if it makes sense.
As v approaches c, this number approaches
zero. That means you can say it
took--the particle has been traveling for 30 hours,
but with this number almost vanishing, the particle would
say, I've been traveling for a very short time.
That's the way it is. The particle will always think
it took less time to go from here to there,
compared to any other observer, because the clock runs fastest
in its own rest frame.
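
To make the relation between Δt and Δτ concrete, here is a minimal Python sketch. It takes the 30-hour trip mentioned above and a hypothetical speed of 0.999c (the speed is an assumption chosen only for illustration), computes the time elapsed on the particle's own clock, and checks that the space-time interval computed from the observer's Δt and Δx equals cΔτ. The function names are likewise illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def proper_time(dt, v):
    """Time elapsed on a clock riding with the particle, given the time dt and
    the constant speed v measured by an observer who sees the particle move."""
    return dt * math.sqrt(1.0 - (v / C) ** 2)


def interval(dt, dx):
    """Space-time interval sqrt((c dt)^2 - dx^2) for a time-like separation."""
    return math.sqrt((C * dt) ** 2 - dx ** 2)


dt = 30.0 * 3600.0   # 30 hours, in seconds, according to the observer
v = 0.999 * C        # hypothetical speed, chosen only for illustration
dx = v * dt          # distance covered according to the observer

d_tau = proper_time(dt, v)

print("hours according to the observer:", dt / 3600.0)
print("hours according to the particle:", d_tau / 3600.0)
print("interval from (dt, dx):", interval(dt, dx))
print("c times proper time:   ", C * d_tau)
```

The last two printed numbers should agree up to rounding, which is the statement that the interval between two events on the particle's trajectory is c times the time on the particle's own clock.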

Okay. Now, this is the new variable
I'm going to use to develop the next step.
The next step is the following; in Newtonian mechanics,
particles have a position x and maybe y and
z. But let me just say x.
Let me take one more. It had x and it had
y. And these were varying with time.

Then, I formed a vector r, which is ix +
jy. And the vector is
mathematically defined as an entity with two components so
that when you rotate the axis, the components go into
x' and y', which are related to x
and y by cos θ and
sin θ. That's defined to be a vector.
Now, if I went to you and said, okay, that's one vector,
the position vector. Can you point to me another
vector? Anybody know other vectors in
Newtonian mechanics? Yes sir?
You don't know any other vectors in the good old
mechanics days? The only vector you've seen is
position? Student: Velocity is one.
Professor Ramamurti Shankar: Very good.
Yeah. That's right.
You're right. You certainly know the answers.
So, you shouldn't hesitate. So, how do you get to velocity
from position? Student:
Take the derivative. Professor Ramamurti
Shankar: You take the derivative.
So, you've got to ask yourself; why does the act of taking
derivative of a vector produce another vector?
Well, what's the derivative? You change the guy by some
Δ, and you divide it by the time.
Now, change in the vector is obviously a vector,
because difference of two vectors is a vector.
Dividing by time is like multiplying by the reciprocal of
the time. That's like multiplying by a
number. And I've told you multiplying a
vector by a number also gives you a vector;
maybe longer or shorter. Therefore, Δr is a
vector because it's the difference of r later
minus r now. Dividing by Δt is the
same as multiplying by 10,000 or 100,000 or 1 million.
It doesn't matter. That's also a vector.
And the limit is also a vector. Therefore, when you take a
derivative of a vector, you get a vector.
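
The argument above can be tried out numerically. The following Python sketch uses a made-up trajectory r(t) and a made-up rotation angle (both are assumptions for illustration) and compares two routes: differentiate and then rotate the axes, or rotate the axes and then differentiate. If the derivative of a vector is a vector, the two routes must give the same components.

```python
import math


def rotate(vec, theta):
    """Components of a 2D vector in axes rotated by angle theta:
    x' = x cos(theta) + y sin(theta), y' = -x sin(theta) + y cos(theta)."""
    x, y = vec
    return (x * math.cos(theta) + y * math.sin(theta),
            -x * math.sin(theta) + y * math.cos(theta))


def position(t):
    """Hypothetical trajectory r(t), chosen only for illustration."""
    return (3.0 * t, 4.0 * t - 0.5 * t ** 2)


def velocity(pos_fn, t, dt=1e-6):
    """Finite-difference derivative: (r later minus r now) divided by dt."""
    (x1, y1), (x0, y0) = pos_fn(t + dt), pos_fn(t)
    return ((x1 - x0) / dt, (y1 - y0) / dt)


theta, t = 0.7, 2.0

# Route 1: differentiate in the original axes, then rotate the result.
v_then_rotate = rotate(velocity(position, t), theta)

# Route 2: rotate the position first, then differentiate in the rotated axes.
rotate_then_v = velocity(lambda s: rotate(position(s), theta), t)

print(v_then_rotate)
print(rotate_then_v)
```

The two printed pairs should agree up to finite-difference rounding, because subtracting two vectors and dividing by a number both commute with the rotation.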

And then you can--Once you got this derivative,
it becomes addictive. You can take second derivatives.
As you said, you can have acceleration.
Then you can get--you can take the acceleration,
multiply it by a mass. Now, mass is called a scalar.
The mass of particle is a number that everyone will agree
on. It has no direction.
So, the product of a number and a vector is another vector,
and that vector, of course, is the force.
So derivatives of vectors and multiples of vectors by scalars,
namely things that don't change when you rotate your axis,
are ways to generate vectors. So what I want to do is,
I want to generate more vectors.
The only four-vector I have is the position four-vector,
which is this guy, X,
whose components I write for you again are
X_0, which is code name for
ct and X_1,
which is code name for x. And you can put the other
components if you like.

Now, take this X to be the coordinate in space time of
an object that's moving. I want to take the derivative
of that to get myself something I could call the velocity vector
in relativity. But the derivative cannot be
the time derivative. You guys have to understand
that if I take these things--Of course in space-time,
the particle is moving. I can certainly tell you where
it is at one time and where it is a little later.
And I can take time derivative. But the time derivative of a
vector in four dimensions is not a vector,
because time is like any other component, you know?
It's like taking the y derivative of x. That
doesn't give you a vector. You've got to take a derivative
with respect to somebody that does not transform,
that does not change from one observer to the other.
So, do you have any idea where I'm going with this?
Yes? Student: Space-time.
Professor Ramamurti Shankar: Yes.
Or you can take derivative with respect to the time as measured
by the clock which is moving with the particle.
So, you can take the τ derivative.
In other words, let the particle move by some
amount ΔX, namely ΔX_0,
ΔX_1, ΔX_2,
etcetera. That difference will also
transform like a vector. We have seen many,
many times that differences of coordinates transform
like a vector. Divide it by this guy,
now, which is the time according to a clock moving with
the particle. Everybody agrees on that,
because we're not asking how much time elapsed according to
you or according to me. We are going to quarrel about
that indefinitely. We are asking,
how much time passed according to the particle itself?
So, no matter who computes the time, you will get the same
answer, and that's what you want to divide it by.
So, I'm going to form a new quantity called velocity,
which is the derivative of this with respect to τ.

And I'm going to use the chain rule and write it as
dX/dτ = (dX/dt)(dt/dτ). That becomes then 1/√[1
- (v^(2)/c^(2))] times (dX_0/dt, dX_1/dt,
dX_2/dt, …). So,
you find the rate of change as measured by a clock moving with
the particle. And that has the virtue that
what you get out of this process will also be a four-vector.
By that I mean, its four components will
transform when you go to a moving frame just like the four
components of X. You remember when you took
x and y and you rotated the axis,
x' is x cos θ plus y sin θ and so on.
But if you take time derivatives, you'll find the
x component of the velocity,
in the rotated frame related to the x and y in the
old frame by the same cosines and the same sines,
because the act of taking derivatives doesn't change the
way the object transforms. So, this is my new four-vector.
In fact, let me put the third guy too;
dX_3/dt. So, what I did was I took the
τ derivative for which we don't have good intuition and
wrote it in terms of a t derivative,
for which we have a good intuition.
Because no matter how much Einstein tells you about space
and time, we think of time as different from space.
So, we'll think in terms of time derivatives.
But to form a four-dimensional vector, it's not enough to do
that. That's what I'm saying.
Every term there is divided by this, because that's the way of
rewriting d by dτ as d by dt times
this factor. So now, I am ready to define
what I'm going to call momentum in relativity.
It's going to be called the four-momentum.
Everything is the four something.
The four-vector position, this is called the four-velocity.
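
Here is a minimal Python sketch of why the τ derivative is the right one. It builds the two-component velocity (dX_0/dτ, dX_1/dτ) for a particle with ordinary velocity v, boosts it to a frame moving at u, and compares it with the same object built directly from the particle's velocity in that frame. The velocity-addition formula used for that comparison is the standard one and is assumed here rather than derived in this passage; the values of v and u, the units with c = 1 and all function names are illustrative assumptions.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption made for convenience


def gamma(v):
    """The factor 1/sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def four_velocity(v):
    """(dX0/dtau, dX1/dtau) = gamma * (c, v) for a particle with ordinary velocity v."""
    return (gamma(v) * C, gamma(v) * v)


def boost(X, u):
    """Lorentz-transform components (X0, X1) to a frame moving at velocity u."""
    g = gamma(u)
    return (g * (X[0] - (u / C) * X[1]), g * (X[1] - (u / C) * X[0]))


def velocity_in_moving_frame(v, u):
    """Standard velocity-addition formula (assumed, not derived in this passage)."""
    return (v - u) / (1.0 - u * v / C ** 2)


v, u = 0.5 * C, 0.6 * C              # hypothetical particle and frame velocities
w = velocity_in_moving_frame(v, u)   # particle velocity seen from the moving frame

# The tau derivative transforms like X itself.
print("boosted tau-derivative:", boost(four_velocity(v), u))
print("tau-derivative from w: ", four_velocity(w))

# The plain t derivative (c, v) does not: boosting it does not give (c, w).
print("boosted t-derivative:  ", boost((C, v), u))
print("t-derivative from w:   ", (C, w))
```

The first two printed pairs should match, while the last two should not, which is the sense in which dividing by Δτ rather than Δt produces a four-vector.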

Now, I'm going to define something called the four-momentum.

The four-momentum is going to be the mass of a particle when
it's sitting at rest, multiplied by this velocity.

So, what is it going to be? Let me write it out.
What is dX_0/dt?
You guys remember X_0 = ct.
Sorry? Student:
[inaudible] Professor Ramamurti
Shankar: X_0 = ct and
dX_0/dt will be equal to c.
You're right. If that's what you were saying,
I agree with you. Yes.
So, this is a vector, m_0c/√[1 -
(v^(2)/c^(2))]. Then the other guy,
other component, let me write simply as
m_0 times the familiar velocity divided by
√[1 - (v^(2)/c^(2))].
If you wanted to keep the x, y, and z velocities,
you can keep them as a vector. Or, if you just wanted to live
in two dimensions, one space and one time,
drop the arrow. So, we have manufactured now a new
beast. It's got four components.
What is it? I modeled it after the old
momentum. I took the mass and I
multiplied it by what's going to pass for velocity in my new
world. And I got this creature with
four parts. You've got to understand what
the four parts mean. It's got a part that looks like
an ordinary vector. And it's got a part,
there's no vectors in it. Completely in analogy with the
fact that X had a part that looked like a vector in the
three spatial components and a part that in the old days was
called a scalar, because it didn't transform.
But of course, in the new world,
everything got mixed up. We got to know who this is and
we have to know who this is. So, I want to show you what
they are.

So, here is my new four-vector; m_0c/√[1 -
(v^(2)/c^(2))]. And m_0v divided by the same factor;
let me drop the other components except this in the
x direction. I may not worry about that now.
I wrote that because I want to be able to call it a
four-vector. It's crazy to call this a
four-vector. I should call it a two-vector.
You guys have already seen two vectors.
Well, you have seen two-vectors in the x,y plane.
This is a two-vector in space-time.
Or, it's part of this big four-component object.
It's a four-vector in four dimensional space-time.
So, we have created something, and we're trying to understand
what we have created. So, let me look at this guy
first. This guy looks like
m_0v divided by this factor.
I don't know what it stands for, but I say,
well, relativistic physics should reduce to Newtonian
physics when I look at slowly moving objects because we know
Newton was perfectly right when he studied slowly moving
objects. If I go to a slowly moving
object, so v/c becomes negligible, I drop that,
and that becomes m_0v; I know that's the old momentum.
So, this guy is just the old momentum properly corrected for
relativistic theory. So, this quantity here deserves
to be called momentum p; you can put an arrow on it,
if you want, to keep all three components,
or don't put an arrow. So, of the four-vector, the three
components are just momentum, but not defined in the old way.
It's not m_0v. It's m_0v
divided by this. So, the momentum of a particle
is very interesting in relativity. Even though nothing
can go faster than light, that doesn't mean the momentum
has an upper limit at m_0c,
because the momentum is not mass times velocity;
it's mass times velocity divided by this crazy factor.
So, as you approach the speed of light, as v approaches
c, the denominator is very close to zero,
this number can become as big as you like.
So, why are people building bigger and bigger accelerators?
If you ask them how fast is a particle moving,
for everybody at Fermilab, at CERN, it's all close to the
velocity of light. It's 99.9999,
other person has a few more nines.
Nothing is impressive when you look at velocity.
But when you look at the momentum, it makes a big
difference how much of the one you have subtracted on the
bottom. If you've only got a difference
in the 19th decimal place, well, you've got a huge
momentum. So, particles have limited
velocity in relativity, but unlimited momentum.
It also means, to speed up this particle is
going to take more and more force as it picks up speed.
In other words, you'll be pushing it like
crazy. It won't pick up speed,
but its momentum will be going up.
It'll pick up speed, but this pickup in speed will
be imperceptible because in the 19th decimal place of
v/c, it'll go from 99999 to
something near the end. The last digit will change.
But the momentum will change a lot.
But I don't want to stop before looking at this guy.
So, we don't know who this is. So, we take this thing,
this is the zero component of P.
Just like I called it X_0 in the old
days, I'm going to call it P_0.
P_0 was m_0c/√[1 -
(v^(2)/c^(2))]. So, I don't know what to make
of this guy either. If I put v = 0,
I get m_0c. That looks like the mass of a
particle, but I have no idea. Okay, so it's the mass of a
particle. See here, when I put v =
0 on the bottom, because the top had a v,
not v/c, something was left over that
looked like something familiar, namely momentum.
Here, if I take v = 0 too quickly, what's left over is
nothing reminiscent of anything in Newtonian mechanics.
So, you want to go to what's
called the next order in
v/c. You don't want to totally
ignore it. You want to keep the first
non-zero term, so we write it as
m_0c[1 - (v^(2)/c^(2))]^(-1/2). I've
just written it with a thing upstairs for the - ½ as the
exponent. Then, I remind you of the good
old formula: (1 + x)^(n) = 1 + nx plus smaller terms, if
x is very small. So, if you use that,
you get m_0c[1 + (v^(2)/2c^(2))] plus
higher powers of v^(2)/c^(2). So,
what are these terms coming out? Well, there's the first term,
for which I have absolutely no intuition.
The second term looks like ½ m_0v^(2)/c.
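
As a rough check on how good the two-term approximation is, the Python sketch below compares the exact P_0 = m_0c[1 - (v^(2)/c^(2))]^(-1/2) with m_0c + ½ m_0v^(2)/c at a few speeds. The units (c = 1, m_0 = 1), the sample speeds and the function names are assumptions made for illustration.

```python
import math

C = 1.0    # speed of light; units with c = 1 are an assumption
M0 = 1.0   # rest mass in arbitrary units, also an assumption


def p0_exact(v):
    """Exact time component: m0 c (1 - v^2/c^2)^(-1/2)."""
    return M0 * C * (1.0 - (v / C) ** 2) ** -0.5


def p0_approx(v):
    """First two terms from (1 + x)^n ~ 1 + n x, with x = -v^2/c^2 and n = -1/2."""
    return M0 * C + 0.5 * M0 * v ** 2 / C


for beta in (0.001, 0.01, 0.1, 0.5):
    v = beta * C
    exact, approx = p0_exact(v), p0_approx(v)
    print(f"v/c = {beta:<6} exact = {exact:.10f} approx = {approx:.10f} "
          f"relative error = {(exact - approx) / exact:.2e}")
```

At small v/c the two agree to many decimal places, and the mismatch grows as v/c increases, which is where the higher powers of v^(2)/c^(2) start to matter.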

So, for the first time, I see something familiar.
I see the good old kinetic energy, but not quite,
because I have this number c here.
So, I decided maybe I shouldn't look at P_0,
but I should look at cP_0.
Maybe that looks something more familiar.
That looks like m_0c^(2) + ½
m_0v^(2) plus more and more terms depending on higher
and higher powers of v^(2)/c^(2).
But now we know who this guy is. This is what we used to call
the kinetic energy of a particle by virtue of its motion.
As the particle picks up speed, the kinetic energy itself
receives more corrections because the other terms with
more and more powers of v/c are not negligible.
You've got to put them back in. Basically, you have to compute
this object exactly. But at low velocities,
this is the main term. So, this is what led Einstein
and mainly him to realize that this quantity is talking about
the energy of an object. This is certainly recognized
as energy. I can put it to good use, right?
You run windmills and so on with kinetic energy,
or hydroelectric power is from channeling kinetic energy of
water into work. So, you figure this is also
part of the energy. So, the first conclusion of
Einstein was, even a particle that's
non-moving seems to have an energy, and that's called the
rest energy. If the particle is moving,
then go back to the derivation and you can find
cP_0 = m_0c^(2)/√[1 - (v^(2)/c^(2))].
This is the full expression for
the energy of a moving particle. Not the approximate one,
but if you keep the whole thing, this is what you get.
So, this says when a particle moves, it's got an energy that
looks like some velocity-dependent factor times m_0c^(2).
And when the particle is at rest, it has got that energy,
and that's the origin of the big formula relating energy and
mass. So, it leaves open the
possibility that even a particle at rest;
you think when a particle is at rest, you have squeezed
everything you can get out of the particle,
right? What more can it do for you?
It gave all it's got and it's stopped.
But according to Einstein, there is a lot of fun left.
We can do something more if there's a way to destroy this.
So, in his theory, it doesn't tell you how you
should do that. But it did point out that if
you can get rid of the mass, because it was energy,
some other form of energy must take its place by the law of
conservation of energy. So, that's how all the nuclear
reactions work. In nuclear reactions,
you can take two parts, like two hydrogen atoms,
hydrogen nuclei. You can fuse them into helium
and you will find the helium's mass is less than the mass of
the two parts. So, there is some energy
missing. Some mass is missing;
that means some energy is missing.
That's the energy released in fusion.
Or you can take a uranium nucleus and give it a tap and break it
up into barium and krypton and some neutrons.
And you will find the fragments together have a mass less than
the parent. And the missing mass times
c^(2) is the energy of the reaction that comes out in
the kinetic energy of the fast moving particles.
So, Einstein is wrongly called the father of the bomb,
because this is as far as he went.
Didn't specify how to extract energy from mass,
but showed very clearly that rest energy--that a particle at
rest seems to have a term that should be called energy because
this guy certainly is. Yes?
Student: [inaudible]
Professor Ramamurti Shankar: Will be 3/8
m_0 (v^(4)/c^(2)).
I don't carry more terms in my head, but frankly,
you won't need it. Student:
[inaudible] Professor Ramamurti
Shankar: There are more corrections to kinetic energy.
In other words, why is this an important
concept? If two particles collide,
we believe energy is conserved. So, you want to add the energy
before and compare it to energy after.
If you want them to perfectly match, if you stop here,
your bookkeeping will not work.
Before will not equal after. You've got to keep all this
infinite number of terms to get it exactly right.
But if you're satisfied to one part in a million or something,
maybe that's as many terms as are needed.
And at very low velocities, you don't even need that
because this is never changing in a collision.
Every particle has a rest mass. This is what we balanced when
we collided. Remember, we collided blocks
and we balanced the kinetic energy?
Actually, that energy is there in every block,
but it is not changing. It is canceling out in the
before and after. Only if the two blocks
annihilated and disappeared would you really have
something interesting. But you don't have that in the
Newtonian physics. Okay.
So, I should tell you that I've given you guys two more
problems, one of which is optional.
You don't have to do that, but it's a very interesting
problem. And I want you to think about
it because it's a problem where there are two rockets
headed towards each other, both of the same length;
say one meter. And this guy thinks he is at
rest, and the second rocket is zooming past him like this.
When the tail of this rocket passes the tip of this,
he sends out a little torpedo aimed at this one.
Clearly, it doesn't do any harm. But look at it from the point
of view of this person. This person says,
"Well, I'm a big rocket. I'm moving like this,
and when my tail met the tip of this rocket, he sent out an
explosive, which certainly hit me down here."
So, the question is, "Does this upper rocket get
attacked?" Does it get hit by the lower
one or not? From one point of view,
you miss. The other point of view,
when you expand this one and shrink this one,
you definitely got hit. So, you've got to ask,
I mean, do you get hit or don't get hit?
You cannot have two answers to one question.
So, I've given the problem. I've given you a lot of terms,
and I think even though it's optional, if you do that
problem, you've got nothing to fear.
Okay? It'll give you good practice in
how to think clearly about any number of these problems.
And I've given you enough hints on how to do that.