GENERAL RELATIVITY

The Origin of the Special and General Theories

Newtonian Mechanics and Inertial Frames of Reference

Newtonian particle mechanics is based on Newton's laws of motion. Newton's first law may be written: "Reference frames exist in which all free particles have zero acceleration". Here a free particle is defined to be one on which no net force acts. It is assumed that the question of whether or not a particle is free is absolute and does not depend on the choice of frame in which the motion is expressed.

Originally Newton presumed the existence of a unique reference frame called "absolute space" with respect to which free particles would have zero acceleration. However, the additional assumption of "absolute time" meant that any frame moving with uniform velocity with respect to absolute space would be dynamically equivalent to the latter. Accordingly, in specifying Newton's laws we normally refer to a set of reference frames, each of which has uniform velocity with respect to any other. These are the so-called inertial frames - in practice, those moving with uniform velocity with respect to the "fixed" stars. (Why the latter should be involved is the subject of the so-called "Mach principle" - that distant matter in the universe determines by some means the inertial effects which we observe.)

Newton's second law may be written: "The acceleration of a particle with respect to any inertial frame is proportional to the force acting on it", or

$$\mathbf{F} = m\,\mathbf{a} \qquad (1)$$

where the constant of proportionality m is called the inertial mass of the particle. We assume here that the force F is independent of the reference frame in which the motion of the particle is expressed. Experimentally equation (1) and its immediate consequences are found to be correct for particles moving with everyday speeds, i.e. small compared with the velocity of light.

Newton's third law is that the forces exerted on each other by two interacting particles are equal and opposite. This proposition is comparatively easy to verify for simple mechanical systems, but it is untrue in electrodynamics, e.g. for the forces between two charged particles in relative motion.

The Origins of the Special Theory of Relativity

As mentioned above, inertial frames all have a constant relative velocity with respect to one another. Mathematically this is in accordance with the so-called Galilean transformation equations, which state in pre-relativistic physics the presumed relation between the spacetime coordinates of an event (x, y, z, t) in one inertial frame and the corresponding coordinates (x', y', z', t') in another. For this purpose we usually imagine the two frames of reference to be in "standard configuration", with the Cartesian axes coinciding at t = t' = 0 and the relative motion being in the common x, x' direction, as shown in the figure.

Figure 1. Inertial frames S and S' in "standard configuration".

Since the distance between the origins is just vt, it is apparently "obvious" that the coordinate transformation equations are of the form

x' = x - vt, y' = y, z' = z, t' = t (2)

the fourth of these expressing the Newtonian belief that the time of an event is the same in all frames of reference, i.e. "time" is absolute. These are the Galilean transformation equations, and it is an elementary exercise to verify that the components of acceleration of any object are the same with respect to the two frames of reference: differentiating x' = x - vt twice with respect to t' = t, with v constant, gives d²x'/dt'² = d²x/dt², and the y and z components are trivially unchanged. So the acceleration in equation (1) will have the same value in any inertial frame.
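The "elementary exercise" mentioned above can also be checked numerically. The following is only a minimal sketch: the trajectory, the frame speed v and the function names are illustrative assumptions, not part of the original notes.

```python
# Numerical check that acceleration is unchanged by the Galilean
# transformation (2). The trajectory and the value of v are arbitrary
# illustrative choices.

def x_in_S(t):
    """Position of a particle in frame S (hypothetical trajectory, metres)."""
    return 3.0 * t**2 + 2.0 * t + 1.0  # constant acceleration of 6 m/s^2

def x_in_S_prime(t, v):
    """Same particle seen from S', using x' = x - v t and t' = t."""
    return x_in_S(t) - v * t

def second_derivative(f, t, h=1e-2):
    """Central finite-difference estimate of d^2 f / dt^2."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

v = 5.0   # relative speed of the two frames, m/s
t0 = 1.5  # instant at which the accelerations are compared, s

a_S = second_derivative(x_in_S, t0)
a_S_prime = second_derivative(lambda t: x_in_S_prime(t, v), t0)

print(a_S, a_S_prime)  # the two values agree to within rounding error
```

The term -vt is linear in t, so it drops out of the second derivative whatever value of v is chosen.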

Since Newton's laws of motion are the basis for all of particle mechanics, later extended to continuum mechanics (rigid bodies etc.), it follows that the whole of classical mechanics "works" in all inertial frames of reference. For example, if the momentum of a physical system is conserved in one inertial frame of reference (which will be the case if there are no external forces acting on it) then it follows that the same result will hold in any other inertial frame.

It is of interest to enquire if this principle also holds in other branches of physics, e.g. does electromagnetism apply equally in all inertial frames of reference? Electromagnetism is based on Maxwell's equations, which are sophisticated ways of expressing, in the form of differential equations, more familiar equations such as Coulomb's law, the Biot-Savart law, etc. It can be shown that, in the absence of matter, Maxwell's equations combine to give a wave equation, describing electromagnetic waves, the velocity of such waves being given by the formula $c = 1/\sqrt{\epsilon_0 \mu_0}$, where $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of free space. When numerical values are inserted, the result is

$c = 2.998 \times 10^{8}$ m s$^{-1}$, the same as the experimental value of the velocity of light. This was a great triumph for electromagnetic theory when it was first discovered, since it identified light as a form of electromagnetic wave, with wavelengths in a particular (visible) range. However one drawback, which proved very troublesome for late 19th century physics, was this: any wave travelling at a speed of exactly c in one inertial frame must surely have a different speed in other inertial frames; we have after all from (2) that the velocity components dx/dt and dx'/dt' in the frames S and S' respectively must differ by v. So electromagnetism, unlike mechanics, appears to single out one inertial frame in particular - that in which the velocity of light is exactly c for light travelling in any direction. For understandable reasons, pre-relativity physicists assumed that this preferred frame was that in which the "medium" for light propagation was at rest, by analogy with the propagation of sound. This all-pervading medium was referred to as the "ether" and assumed to be endowed with specific physical properties, even though in empty space it seemed to consist of nothing at all.
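The numerical value of c quoted above can be reproduced from the standard SI relation c = 1/sqrt(epsilon_0 mu_0). This is a small sketch only; the constant values are the commonly tabulated ones and are assumptions brought in here, not figures taken from the notes.

```python
import math

# Commonly tabulated SI values (assumed here, not quoted in the notes).
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
MU_0 = 4.0e-7 * math.pi       # permeability of free space, H/m (classical value)

# Wave speed predicted by Maxwell's equations in the absence of matter.
c = 1.0 / math.sqrt(EPSILON_0 * MU_0)

print(f"c = {c:.4e} m/s")  # close to 2.998e8 m/s
```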

The experimental search for the ether took place in the late 19th century, but ultimately proved to be a blind alley. Repeated attempts (see textbook accounts of the Michelson-Morley and other experiments) to measure the velocity of the earth with respect to the ether frame ended in failure, and likewise various ingenious attempts to account in an ad hoc way for those failures.

Einstein's Special Theory of Relativity

Einstein started from the belief that physics demonstrates an essential unity - there are no rigid boundaries between its various disciplines. So electromagnetism should be on the same footing as mechanics in the sense of being valid in all inertial frames of reference. He realised also that the Galilean transformation equations (2), the origin of the contradiction just discussed, are not self-evidently true, but are fallible assertions about the results of hypothetical physical experiments. It is possible therefore that they could be wrong, in spite of the fact that they seem so "obvious" and work so well in everyday situations.

In 1905, Einstein put forward the following two postulates:

1. The laws of physics should have the same form in all inertial frames of reference.

2. Electromagnetic signals in vacuo travel at speed c with respect to all inertial reference frames.

Note that the first postulate confirms the special role of inertial frames of reference in physics. It implies that any proposed law of physics should satisfy the theoretical test of "covariance"; i.e. when we have expressed it in mathematical form in one inertial frame, transforming the coordinates to a different inertial frame should leave the form of the equation unaltered.

The second postulate reveals the velocity of light as one of the few fundamental constants, on the same footing as Planck's constant and the electronic charge. But it also compels us to question the validity of the Galilean transformation equations, since they are clearly incompatible with the postulate that the velocity of light never changes. Recognising however that the Galilean equations are experimentally correct for small velocities, we search for a more general set of transformation equations which will (we guess) approximate to the Galilean equations in some appropriate limit.

The Lorentz transformation equations

These equations describe how the coordinates (x, y, z, t) of an event in one inertial frame S are connected to the corresponding coordinates (x', y', z', t') in another frame S'. In the simplest case we set up the two frames of reference as shown in figure 1, the "standard configuration" of S and S', with relative velocity v between the frames in the common x, x' direction.

(The actual assignment of coordinates to events in a given inertial frame is straightforward, assuming standard measuring rods and clocks distributed throughout the frame. In particular, each clock at a distance r from the origin is synchronised with all others by setting it to read t = r/c when a light signal, sent out from the origin at t = 0, arrives at the clock. Then the time of any event is the time currently shown on the clock beside which the event occurs.)

The arguments which lead to the new "Lorentz transformation equations" include, as well as Einstein's postulates, some basic assumptions about the nature of space and time. We assume for example that there are no preferred directions in space, and that the choice of the origins of the coordinate systems is of no fundamental importance. Taken together with the first postulate - that all inertial reference frames should be on the same footing - it can be shown that the transformation equations must be linear, and are restricted to the mathematical form

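$$x' = \frac{x - vt}{\sqrt{1 - v^2/k^2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{t - vx/k^2}{\sqrt{1 - v^2/k^2}} \qquad (3)$$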

where k is a constant with the dimensions of velocity. At this point we note that the Galilean transformation equations (2) are obtained by making the erroneous assumption that k = ∞.

It is Einstein's second postulate which determines the correct value of k. We imagine a light pulse emitted at t = 0 from the origin of the inertial frame S, and which at any subsequent time will be of spherical shape described by the equation

$$x^2 + y^2 + z^2 - c^2 t^2 = 0. \qquad (4)$$

The second postulate requires that the pulse should be described in frame S' by an equation of exactly the same form, i.e.

$$x'^2 + y'^2 + z'^2 - c^2 t'^2 = 0. \qquad (5)$$

It is necessary therefore that either equation should imply the other; and this is achieved by taking k = c, for then we have (as is easy to verify)

$$x'^2 + y'^2 + z'^2 - c^2 t'^2 = x^2 + y^2 + z^2 - c^2 t^2. \qquad (6)$$
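The role of the second postulate in fixing k can be illustrated numerically. The sketch below assumes the standard linear form x' = g(x - vt), t' = g(t - vx/k²) with g = 1/sqrt(1 - v²/k²), works in units where c = 1, and uses made-up function names and sample values; it is an illustration rather than a derivation.

```python
import math

C = 1.0  # units in which the speed of light is 1 (an assumption of this sketch)

def transform(x, y, z, t, v, k):
    """Linear transformation with velocity constant k (assumed standard form)."""
    g = 1.0 / math.sqrt(1.0 - v**2 / k**2)
    return g * (x - v * t), y, z, g * (t - v * x / k**2)

def pulse_residual(x, y, z, t):
    """x^2 + y^2 + z^2 - c^2 t^2, which vanishes for events on the light pulse."""
    return x**2 + y**2 + z**2 - C**2 * t**2

# An event on the light pulse in frame S: 1.2^2 + 1.6^2 = 2^2.
event = (1.2, 1.6, 0.0, 2.0)
v = 0.6

residual_c = pulse_residual(*transform(*event, v=v, k=C))        # k = c
residual_other = pulse_residual(*transform(*event, v=v, k=2.0))  # k = 2c

print(residual_c, residual_other)
```

With k = c the transformed event still lies on the pulse (the residual is zero to rounding error), whereas any other choice of k, such as 2c here, leaves a non-zero residual.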

Hence the final result is

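$$x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \quad y' = y, \quad z' = z, \quad t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}} \qquad (7)$$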

which are the Lorentz transformation equations. The Galilean transformation equations are an approximation to these, valid in the "non-relativistic limit" v/c « 1.
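The statement about the non-relativistic limit can be made concrete by comparing the two transformations for one everyday speed and one relativistic speed. This is a rough sketch; the event coordinates, the speeds and the helper names are arbitrary assumptions chosen for illustration.

```python
import math

C = 2.998e8  # speed of light in m/s, as quoted in the text

def galilean(x, t, v):
    """Galilean transformation (x and t only)."""
    return x - v * t, t

def lorentz(x, t, v):
    """Lorentz transformation (x and t only), standard configuration."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return g * (x - v * t), g * (t - v * x / C**2)

def relative_gap(x, t, v):
    """Largest fractional disagreement between the two transformations."""
    xg, tg = galilean(x, t, v)
    xl, tl = lorentz(x, t, v)
    return max(abs(xl - xg) / abs(xg), abs(tl - tg) / abs(tg))

event = (1000.0, 10.0)  # x in metres, t in seconds (illustrative values)

slow = relative_gap(*event, v=30.0)     # roughly motorway speed
fast = relative_gap(*event, v=0.6 * C)  # 60% of the speed of light

print(f"v = 30 m/s : {slow:.1e}")
print(f"v = 0.6 c  : {fast:.2f}")
```

At 30 m/s the two sets of equations agree to far better than one part in a billion, while at 0.6c they differ by about 25%.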

Some consequences of the Lorentz transformation equations

1. Reversal of the roles of the two frames. We took S to be the frame at rest and S' to be moving in the positive x direction with respect to S with speed v. But this initial choice was arbitrary, and we could just as well have taken S' to be at rest, with S moving in the negative x' direction with respect to S' with speed v. The equivalence of these two descriptions is shown mathematically by solving equations (7) for the coordinates x, y, z, t in terms of the primed coordinates. We find

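$$x = \frac{x' + vt'}{\sqrt{1 - v^2/c^2}}, \quad y = y', \quad z = z', \quad t = \frac{t' + vx'/c^2}{\sqrt{1 - v^2/c^2}} \qquad (8)$$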

which are of the same mathematical form but with the primed and non-primed coordinates interchanged and v replaced by -v.
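The claim that the inverse transformation has the same form with v replaced by -v can be checked with a simple round trip. The sketch below works in units where c = 1 and uses an arbitrary sample event; the function name is a hypothetical choice.

```python
import math

def lorentz(x, t, v, c=1.0):
    """Lorentz transformation of (x, t) to a frame moving with velocity v."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x - v * t), g * (t - v * x / c**2)

v = 0.8
x, t = 3.0, 5.0  # an arbitrary event in frame S

# S -> S' with velocity +v, then S' -> S by applying the same formula with -v.
x_p, t_p = lorentz(x, t, v)
x_back, t_back = lorentz(x_p, t_p, -v)

print((x_p, t_p))
print((x_back, t_back))  # recovers (3.0, 5.0) up to rounding error
```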

2. Time dilation. Consider two events which occur at the same place in one of the two frames - say the frame S'. Denoting the coordinates of the events $(x'_1, y'_1, z'_1, t'_1)$ and $(x'_2, y'_2, z'_2, t'_2)$ in frame S' (so that $x'_1 = x'_2$, $y'_1 = y'_2$, $z'_1 = z'_2$) and $(x_1, y_1, z_1, t_1)$ and $(x_2, y_2, z_2, t_2)$ in frame S, application of the fourth of the transformation equations (8) to both events followed by subtraction yields the result

$$t_2 - t_1 = \frac{t'_2 - t'_1}{\sqrt{1 - v^2/c^2}}. \qquad (9)$$

This equation shows that the time interval between the two events does not take the same value in all frames of reference; the interval in S is extended by a factor $(1 - v^2/c^2)^{-1/2}$ compared with the interval in S'. We call this time dilation. The quantity $t'_2 - t'_1$ is called the proper time interval between the events, and is the minimum time interval which would be ascribed to the two events in any inertial reference frame.

This result is often expressed by the statement: "moving clocks go slow". This refers to a hypothetical attempt by observers in one frame of reference - the frame S in this case - to ascertain the rate of a standard clock which is at rest in the moving frame S'. To do this the observers in S note two particular ticks of the moving clock and use these as the two events whose space and time coordinates have just been described. They are bound to find, if the theory is correct, that the moving clock has registered a smaller time interval than is recorded by clocks in their own frame, so their conclusion is that, compared with their own clocks (which naturally are assumed to be "correct"), the moving clock is going slow.

It is only apparently anomalous that the same conclusion, but in reverse, would be reached by observers in the frame S' who make observations on a clock at rest in S. Their conclusion, in other words, is that

$$t'_2 - t'_1 = \frac{t_2 - t_1}{\sqrt{1 - v^2/c^2}}, \qquad (10)$$

the same equation as (9) but with the primed and unprimed coordinates interchanged. There is no contradiction between these equations, since the two physical situations are different. In the first case, the two events which are being used for purposes of comparing time intervals - the two ticks of the clock in S' - occur at the same place in frame S', and in the second, they are at the same place in frame S.
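The symmetry of time dilation, and the reason it is not contradictory, can be seen by working through both situations numerically. The sketch below assumes units where c = 1, a relative speed v = 0.6 and clocks sitting at the origins of their frames; these choices and the function names are illustrative only.

```python
import math

def gamma(v, c=1.0):
    """The factor (1 - v^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def to_S_prime(x, t, v, c=1.0):
    """Lorentz transformation from S to S' (x and t only)."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)

def to_S(x_p, t_p, v, c=1.0):
    """Inverse transformation from S' to S (x and t only)."""
    g = gamma(v, c)
    return g * (x_p + v * t_p), g * (t_p + v * x_p / c**2)

v = 0.6

# Case 1: two ticks of a clock at rest at the origin of S' (x' = 0),
# one unit of proper time apart, as timed by the clocks of S.
_, t1 = to_S(0.0, 0.0, v)
_, t2 = to_S(0.0, 1.0, v)
dt_S = t2 - t1

# Case 2: two ticks of a clock at rest at the origin of S (x = 0),
# one unit of proper time apart, as timed by the clocks of S'.
_, tp1 = to_S_prime(0.0, 0.0, v)
_, tp2 = to_S_prime(0.0, 1.0, v)
dt_S_prime = tp2 - tp1

print(dt_S, dt_S_prime)  # both are gamma(0.6), i.e. about 1.25
```

In each case the pair of events is different (same place in S' in the first, same place in S in the second), which is why each set of observers finds the other's clock to run slow by the same factor.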

3. The velocity of light as a limiting velocity. The factor $(1 - v^2/c^2)^{-1/2}$ becomes imaginary for values of v exceeding c, so we see that no frame of reference can have a value of v in this range. Furthermore, since frames of reference may be constructed from material objects, it follows that no particle may have a velocity exceeding c either.

4. The Doppler effect. Like the Doppler effect in acoustics, this refers to the shift in frequency when a wave impinges on two observers who are in relative motion with respect to one another. In its simplest version, we will consider an electromagnetic wave approaching from x = +∞ two observers located at the origins of the two reference frames S and S'. Given that the two observers record frequencies n and n' respectively, it can be shown that these are related by the equation