
 

SPECIAL THEORY OF RELATIVITY

by Ray Shelton

 

INTRODUCTION

A problem with Maxwell’s Theory of the Electromagnetic Fields led to Einstein’s Special Theory of Relativity. The following is a discussion of that problem and its solution by Einstein’s Special Theory of Relativity.

 


 


THE PROBLEM

In 1864 James Clerk Maxwell (1831-1879) presented to the Royal Society his famous paper “On a Dynamical Theory of the Electromagnetic Field,” in which he set forth his four, now famous, equations describing all electromagnetic radiation. These equations described exactly how electromagnetic forces worked and, along with Newton’s Laws of Motion and of Universal Gravitation, seemed to provide a complete understanding of the physical world. Maxwell’s equations not only described electromagnetic phenomena but they also predicted that there should be electromagnetic waves that travel at the speed of light. The prediction of the existence of electromagnetic waves was confirmed on Friday, the thirteenth of November, 1888, by Heinrich Hertz (1857-1894), who detected electromagnetic waves emitted by an electric spark. These waves were called Hertzian waves; we now call them radio waves. He also showed that electromagnetic waves of short wavelength are refracted more than those of long wavelength in passing through matter, just as Newton had shown in his prism experiment for light. He showed that these electromagnetic waves could be diffracted and that they also showed interference. There was no difference, except wavelength and frequency, between these invisible long electromagnetic waves made by an electric spark and the light seen by our eyes. In 1894, a young Italian, Guglielmo Marconi, who was only twenty years old, read of Hertz’s work and got the idea of using electromagnetic waves for communications. In 1898 he accomplished the transmission and reception of the Hertzian waves over a distance of a few miles, and in 1901 he sent and received them successfully across the Atlantic.

Maxwell was the first to understand that light, electricity, and magnetism are intimately related and that light itself consists of electromagnetic waves of extremely high frequency and short wavelength. Noting that the ratio of the two constants ke/km connecting electricity and magnetism, which had already been measured in the laboratory, was exactly the square of the speed of light, he concluded that light is an electromagnetic wave. That is,


$$\frac{k_e}{k_m} = \frac{9 \times 10^{9}\ \text{newton}\cdot\text{meter}^2/\text{coul}^2}{10^{-7}\ \text{newton}\cdot\text{sec}^2/\text{coul}^2}.$$


He calculated the value of that ratio and found it to be 9 × 10¹⁶ meters²/sec², the square of the speed of light. But in this fact lay the seed of a great problem. Consider sound waves that travel through the air at about 1000 feet per second. If you, as an observer, are traveling at 100 feet per second in the direction opposite to that of the sound, the sound would be moving relative to you at 1100 feet per second. But Maxwell’s equations indicated that light, as electromagnetic waves, would be measured to have the same speed by any observer, however he was moving. This inconsistency was the result of considering that light waves are like sound waves in that the waves require a medium for their propagation. This medium for light waves was called “aether”. According to nineteenth century physics, aether had to have unusual properties. It had to be immensely rigid to give light a speed of 3 × 10⁸ m/sec and at the same time to have zero density so that bodies like the planets could travel through it. Also it had to be perfectly transparent to account for its undetectability. Later this attempt to attribute to the aether properties analogous to the properties of matter was abandoned. The sole property of aether was to support electromagnetic waves. To nineteenth century physicists it was difficult to conceive of waves as traveling without a medium. Many physicists concluded that Maxwell’s theory was valid only if the observer was at rest relative to the aether. But then, what observers are at rest relative to the aether? Maxwell’s theory did not provide this knowledge, and without this knowledge the theory seemed incomplete. Was the aether at rest relative to the earth? No one wanted to accept the view that the earth was the center of the aether universe. The commonly accepted view was that the aether was at rest with respect to the fixed stars. This meant that the earth was moving relative to the aether, in different directions at different times as it moved along its orbit around the sun. That is, the speed of light should vary when measured at different points along the earth’s orbit. Measuring this effect became the crucial experiment of the late nineteenth century.
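As a quick numerical check of the ratio of constants quoted earlier in this section, the following minimal Python sketch divides the two rounded values given in the text and takes the square root. The variable names are illustrative only, and the constants are the rounded figures from the text rather than precise measured values.

```python
import math

# Rounded constants as quoted in the text (assumed values, SI-style units).
K_E = 9e9    # electric constant, newton·meter^2/coul^2
K_M = 1e-7   # magnetic constant, newton·sec^2/coul^2

ratio = K_E / K_M          # has units of meter^2/sec^2
speed = math.sqrt(ratio)   # has units of meter/sec

print(f"k_e/k_m       = {ratio:.1e} m^2/s^2")
print(f"sqrt(k_e/k_m) = {speed:.1e} m/s")
```

The square root comes out at about 3 × 10⁸ m/sec, the speed of light quoted in the text.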

 

The Michelson-Morley Experiment

The most accurate attempt to measure the speed of light at different points along the earth’s orbit was made by two American physicists, Albert A. Michelson (1852-1931) and Edward W. Morley (1838-1923), in 1887. Michelson first performed the experiment in 1881, using an instrument that he had invented, called an optical interferometer, and then, in 1887, with the collaboration of his colleague Morley, carried out a more precise version of what is now known as the Michelson-Morley experiment. An optical interferometer is an ingenious device that splits a light beam into two parts that travel over two different paths; after the parts are reflected back along their paths, the device recombines them to form an interference pattern. The Michelson interferometer, being fixed with respect to the earth, moves with the earth through the aether, which was considered to be at rest relative to the fixed stars and the sun. Hence the earth and the interferometer move through the aether at a speed of about 30 km/sec, in different directions during different seasons of the year. The Michelson interferometer was designed to measure the speed of light relative to the aether. It was assumed that aether filled all of space and that it was the medium with respect to which the speed of light was to be measured. It followed that an observer moving through the aether with velocity v would measure a velocity c′ for a light beam, where c′ = c + v. It was this result that the Michelson-Morley experiment was designed to detect.

The Michelson interferometer had a monochromatic light source S whose light beam was aimed at a partially silvered mirror M inclined at 45° to the direction of the beam. The glass of this mirror was lightly coated with silver so that approximately half the light would be reflected and half would pass through the glass of the mirror. The beam of light is thus split by the partially silvered mirror M into two coherent beams, one passing through the mirror in the direction of the original beam, and the other reflected off the mirror at right angles to the direction of the original beam. Beam 1 is transmitted through M and is reflected back to M by a plane mirror M1. Beam 2 is reflected at right angles off M and is reflected back to M by a second plane mirror M2. Thus both beams are reflected by mirrors at the ends of their paths back toward the partially silvered mirror M, where each is again partially reflected and partially transmitted. There they interfere with each other, and the interference pattern is observed through a telescope set at right angles to the original beam.

Suppose that one of the arms of the interferometer is aligned along the direction of the motion of the earth through space. The earth moving through the aether would be equivalent to the aether flowing past the earth in the opposite direction. This “aether wind” blowing in the direction opposite to the earth’s motion should cause the speed of light measured in the earth’s frame of reference to be c − v as the light approaches the mirror M1 at the end of the arm lying in the direction of motion, and c + v after reflection, where c is the speed of light in the aether frame of reference and v is the speed of the earth through space and hence the speed of the aether wind. The two returning beams of light recombine, and an interference pattern consisting of alternate dark and bright bands is formed.

The interference is constructive or destructive depending on the phase difference of the two beams. The phase difference can arise from two causes: the different path lengths L1 and L2, and the different speeds of travel along the two paths with respect to the aether wind. To compensate for the first cause, the path lengths L1 and L2 are made as nearly equal as possible. The second cause is the crucial one. It can be explained by means of a commonly used analogy. The different speeds along the two arms of the interferometer are much like the different cross-stream and up-and-down-stream speeds, with respect to the shore, of a row boat in a moving stream. The time t1 for beam 1 to travel from M to M1 and back is


$$t_1 = \frac{L_1}{c - v} + \frac{L_1}{c + v} = \frac{2L_1}{c}\left[\frac{1}{1 - v^2/c^2}\right],$$


where the light, whose speed is c in the aether, has an up-stream speed of c − v with respect to the device and a down-stream speed of c + v. The path of beam 2, traveling from M to M2 and back, is the cross-stream path through the aether. During the round-trip time t2 the device moves at right angles to the path L2 through a distance d = vt2. Therefore, beam 2 travels diagonally from M to M2 while the device moves a distance vt2/2 through the aether. By the Pythagorean Theorem, beam 2 travels a diagonal distance from M to M2 equal to


$$\sqrt{L_2^2 + (v t_2/2)^2} = \tfrac{1}{2}\sqrt{4L_2^2 + v^2 t_2^2}.$$


Beam 2 is then reflected off the plane mirror M2 and travels diagonally from M2 back to M, which has meanwhile moved an additional distance vt2/2 through the aether. By the Pythagorean Theorem, this return diagonal distance from M2 to M is also equal to


$$\tfrac{1}{2}\sqrt{4L_2^2 + v^2 t_2^2},$$


so that beam 2 is able to return to the (advancing) mirror M. Thus beam 2 travels a total distance in time t2 equal to


$$\sqrt{4L_2^2 + v^2 t_2^2}.$$


Since light travels at speed c in the aether, the total distance that beam 2 travels from M to M2 and back again is also ct2, so that


$$c t_2 = \sqrt{4L_2^2 + v^2 t_2^2}.$$


The time t2 that beam 2 takes to travel from M to M2 and back again can be found by solving this equation for t2. Squaring the equation, we get


$$c^2 t_2^2 = 4L_2^2 + v^2 t_2^2,$$

or, collecting the terms in t2,

$$c^2 t_2^2 - v^2 t_2^2 = 4L_2^2, \quad \text{or} \quad t_2^2 = \frac{4L_2^2}{c^2 - v^2},$$

so that

$$t_2 = \frac{2L_2}{\sqrt{c^2 - v^2}} = \frac{2L_2}{c}\left[\frac{1}{\sqrt{1 - v^2/c^2}}\right].$$


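Thus the difference in time Δt = t1 − t2 taken by the two beams of light is


$$\Delta t = t_1 - t_2 = \frac{2}{c}\left[\frac{L_1}{1 - v^2/c^2} - \frac{L_2}{\sqrt{1 - v^2/c^2}}\right],$$

or, if L1 = L2 = L, then

$$\Delta t = t_1 - t_2 = \frac{2L}{c}\left[\frac{1}{1 - v^2/c^2} - \frac{1}{\sqrt{1 - v^2/c^2}}\right].$$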


Thus according to classical physics there is a difference in transit time between the two beams of light. This formula can be simplified by using the binomial expansion and dropping terms of higher than second order in v/c, so that 1/(1 − v²/c²) ≈ 1 + v²/c² and 1/√(1 − v²/c²) ≈ 1 + v²/2c². We get the following formula for the difference in time between the two beams:

$$\Delta t = t_1 - t_2 \approx \frac{L}{c}\,\frac{v^2}{c^2}.$$
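The classical transit times and the second-order approximation can be compared numerically. The sketch below is only an illustration: the function names are hypothetical, and the arm length of 11 meters is an assumption chosen so that the two arms together give the 22 meter path mentioned later in the text.

```python
import math

def transit_times(L1, L2, v, c):
    """Classical (aether-theory) round-trip times for the two beams."""
    beta2 = (v / c) ** 2
    t1 = (2 * L1 / c) / (1 - beta2)            # arm along the aether wind
    t2 = (2 * L2 / c) / math.sqrt(1 - beta2)   # arm across the aether wind
    return t1, t2

def delta_t_second_order(L, v, c):
    """Second-order approximation (L/c)(v^2/c^2) for equal arms."""
    return (L / c) * (v / c) ** 2

L = 11.0   # assumed arm length in meters (illustrative)
V = 3e4    # orbital speed of the earth, m/sec
C = 3e8    # speed of light, m/sec

t1, t2 = transit_times(L, L, V, C)
exact = t1 - t2
approx = delta_t_second_order(L, V, C)
print(f"exact  difference: {exact:.3e} s")
print(f"approx difference: {approx:.3e} s")
```

The two figures agree closely because the neglected terms are smaller by a further factor of order v²/c², which is about 10⁻⁸ for the earth's orbital speed.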


The whole interferometer was mounted on a stone table which was floated on a pool of mercury in a large tank, so that the whole device could be rotated through an angle of 90°. The interference pattern could be observed while the interferometer was rotated. The idea was that this rotation would change the speed of the aether wind along the arms of the interferometer, and consequently the fringe pattern would shift slightly but measurably. After the device is rotated by 90°, so that L1 is the cross-stream length and L2 is the up-and-down-stream length, the transit time difference, now designated by primes, will be


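$$\Delta t' = t'_1 - t'_2 = \frac{2}{c}\left[\frac{L_1}{\sqrt{1 - v^2/c^2}} - \frac{L_2}{1 - v^2/c^2}\right].$$


The rotation therefore changes the time difference by


$$\Delta t - \Delta t' = \frac{2(L_1 + L_2)}{c}\left[\frac{1}{1 - v^2/c^2} - \frac{1}{\sqrt{1 - v^2/c^2}}\right].$$


Using the binomial expansion and dropping terms of higher than second order, we get


$$\Delta t - \Delta t' \approx \frac{L_1 + L_2}{c}\,\frac{v^2}{c^2}.$$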


Therefore, the rotation should cause a shift in the fringe pattern, since it changes the phase relationship between beams 1 and 2. If the optical path difference between the beams changes by one wavelength, for example, there will be a shift of one fringe across the cross-hairs of the viewing telescope. Let ΔN represent the number of fringes moving past the cross-hairs of the telescope as the pattern shifts. Then, for light of wavelength λ and frequency f, so that the period of one vibration is T = 1/f = λ/c,


$$\Delta N = \frac{\Delta t - \Delta t'}{T} = \frac{L_1 + L_2}{cT}\,\frac{v^2}{c^2} = \frac{L_1 + L_2}{\lambda}\,\frac{v^2}{c^2}.$$


Michelson and Morley were able to obtain an optical path length, L1 + L2, of about 22 meters by multiple reflections. In their experiment the arms of their interferometer were of nearly equal length, that is,


L1 = L2 = L, so that

$$\Delta N = \frac{2L}{\lambda}\,\frac{v^2}{c^2} = \frac{22\ \text{m}}{5.5 \times 10^{-7}\ \text{m}}\,(10^{-8}) = 0.4,$$

where λ = 5.5 × 10⁻⁷ meters and v/c = 10⁻⁴, since the speed of the earth in its orbit is v = 3 × 10⁴ meters/sec and the speed of light is c = 3 × 10⁸ meters/sec. Thus ΔN is a shift of four-tenths of a fringe.
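The arithmetic behind the expected fringe shift can be reproduced in a few lines. This is a minimal sketch using the values quoted above; the function name `fringe_shift` is hypothetical.

```python
def fringe_shift(total_path, wavelength, v, c):
    """Expected fringe shift: [(L1 + L2) / wavelength] * (v/c)^2."""
    return (total_path / wavelength) * (v / c) ** 2

shift = fringe_shift(
    total_path=22.0,     # L1 + L2 in meters, as quoted in the text
    wavelength=5.5e-7,   # meters
    v=3e4,               # orbital speed of the earth, m/sec
    c=3e8,               # speed of light, m/sec
)
print(f"expected shift: {shift:.2f} fringe")
```

Because the shift scales with the total path length, the multiple reflections used to lengthen the path directly increased the sensitivity of the apparatus.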

 

The Failure of the Experiment

But measurements failed to show any change in the interference pattern! On the chance that the earth was at rest relative to the aether at one point in its orbit (in which case the time difference would not occur), Michelson and Morley repeated their experiment over a period of six months, constantly rotating their device. Not once did they observe the expected shift in the interference pattern. To Michelson this was a grave disappointment. He considered the experiment a failure. He repeated the experiment under various conditions, at different times and locations, but the result was always the same: no fringe shift of the magnitude required was ever observed. The immediate conclusion to be drawn from this failure of the Michelson-Morley experiment to detect any effect of the aether on the motion of light waves is that the speed of light in free space is a constant everywhere, completely independent of any motion relative to the aether. And a corollary of this conclusion is that the aether does not exist. Here was a huge puzzle. On the one hand, how could one explain the failure of the Michelson-Morley experiment and still believe in the existence of aether? On the other hand, if aether did not exist, how could one explain how light was propagated without a medium, since the sole property of aether was its ability to support electromagnetic waves? The nineteenth century physicists were not willing to abandon the concept of the aether.

The negative result of the Michelson-Morley experiment was variously interpreted. Sir Oliver Lodge suggested that a layer of aether was dragged along by the earth as it rotated, so that the aether around the earth was relatively at rest. He tried various experiments to verify his hypothesis, but no experiment showed the existence of the aether drag. The Dutch theoretical physicist, Hendrik Antoon Lorentz (1853-1928), and an Irish mathematician, G. F. FitzGerald (1851-1901), independently suggested an ad hoc explanation of the negative result of the Michelson-Morley experiment. FitzGerald in 1892 proposed the hypothesis, which was elaborated by Lorentz, that all bodies contract in the direction of motion relative to the stationary aether by a factor of √(1 − v²/c²). The motion of the earth through the aether would cause a shortening of the arm of the Michelson interferometer lying in the direction of the earth’s motion by exactly the amount required to eliminate the fringe shift. Since a meter stick placed alongside any object would suffer the same fractional contraction, there can be no operational procedure for verifying or disproving this assertion of FitzGerald. Lorentz, who had independently thought of this possibility, justified the hypothesis in terms of a possible change in the electromagnetic forces between the constituent atoms of a body due to its motion. This length contraction, now known as the Lorentz-FitzGerald contraction, is also a consequence of Einstein’s special theory of relativity, but in his theory it is not an arbitrary axiom contrived to explain an otherwise incomprehensible observation.
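To see why this particular factor removes the expected effect, one can substitute the contracted length into the classical transit times derived earlier. The following worked step is a sketch under the simplifying assumption that both arms have the same rest length L, so that the arm along the motion has length L√(1 − v²/c²) while the cross-stream arm is unchanged:

$$t_1 = \frac{2L\sqrt{1 - v^2/c^2}}{c}\left[\frac{1}{1 - v^2/c^2}\right] = \frac{2L}{c}\left[\frac{1}{\sqrt{1 - v^2/c^2}}\right] = t_2, \qquad \Delta t = t_1 - t_2 = 0.$$

The same cancellation holds after the device is rotated through 90°, since the contraction always applies to whichever arm lies along the direction of motion, so no fringe shift is predicted.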

 

The Principle of Relativity

All these interpretations of the failure of the Michelson-Morley experiment assumed that the aether existed, since it was necessary to explain how electromagnetic waves are propagated. But the failure of so many attempts to measure the velocity of the earth relative to the aether suggested to the brilliant mind of the French physicist Jules Henri Poincare (1854-1912) a new possibility. In his lectures at the Sorbonne in 1899, after reviewing the experiments so far, he proposed that the absolute motion of a body is not detectable in principle. In the following year, at an International Congress of Physics held in Paris, he asserted the same view. He said “Our aether, does it really exist? I do not believe that more precise observations could ever reveal anything more than relative displacements.” A new principle must be introduced into Physics, which would resemble the Second Law of Thermodynamics in so far as it asserts the impossibility of doing something; that is, the impossibility of determining the velocity of the earth relative to the aether. He published these views in April of 1904 and in a lecture to a Congress of Arts and Science at St. Louis, U.S.A., on 24 Sept., 1904, Poincare gave to a generalized form of this principle the name, The Principle of Relativity. He said, “According to the Principle of Relativity the laws of physical phenomena must be the same for a ‘fixed’ observer as for an observer who has a uniform motion of translation relative to him; so that we have not, and cannot possibly have, any means of discerning whether we are, or are not, carried along in such a motion.” And after reviewing the records of observations in the light of this principle, he says, “From all these results there must arise an entirely new kind of dynamics, which will be characterized above all by the rule, that no velocity can exceed the velocity of light.”

 

Coordinate Transformations

In 1895 H. A. Lorentz developed a set of equations for a moving electric system by applying a transformation to the fundamental equations of the aether. But in the original form of this transformation, quantities of order higher than the first in (v/c) were neglected. In 1900 Sir J. Larmor extended the analysis so as to include quantities of the second order. Lorentz in 1903 went even further and obtained the transformation from one coordinate system to another moving with constant velocity v with respect to the first, such that the invariance of Maxwell’s equations is retained:


$$x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}},$$

$$y' = y,$$

$$z' = z,$$

$$t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}.$$


In June, 1905, Poincare gave to this set of coordinate transformations the name of Lorentz transformations. These equations assume that the second frame of reference is moving along the x direction of the first frame at a constant velocity v relative to it. Note that in these Lorentz equations distance and time are involved together. The equations relating x and x′ and relating t and t′ are not the simple ones of the Galilean transformations:


$$x' = x - vt,$$

$$y' = y,$$

$$z' = z,$$

$$t' = t.$$


These Galilean equations are the equations of coordinate transformation between two frames of reference moving relative to one another with a uniform translational motion. Suppose that an object is moving with respect to both frames. With respect to the first frame, the object describes a particular path with some definite motion along that path. With respect to the second frame, the path and motion will be different. For mathematical purposes, each frame uses a coordinate system to specify positions in space. Let us call the fixed frame K and the frame moving to its right at constant velocity K′. The observers in both frames are supposed to have identical clocks. Now let P be a point in space. Its coordinates with respect to K′ are x′, y′ and z′, and with respect to K they are x, y, and z. Since the frame K′ is moving with velocity v parallel to the positive x-axis of the coordinate system of the frame K, the equation of transformation between x and x′ is x′ = x − vt. The equations for the other coordinates are y′ = y and z′ = z. These two frames are called Galilean or inertial frames. One moves relative to the other at a constant speed in a straight line; no relative acceleration or rotation between the frames takes place. In Newton’s terms, Galilean frames are at rest or moving with uniform translational speed through absolute space, without acceleration or rotation. It cannot be determined which one is at rest in absolute space, but this does not matter because we know the laws of transformation. Since we can speak only of the relative velocity of one frame with respect to another, and not of an absolute velocity of a frame, this principle is sometimes called Newtonian relativity. Transformation equations, in general, will change many quantities but will leave others unchanged. These unchanged quantities are called invariants of the transformation. In the Galilean transformation laws for the relation between observations made in different inertial frames of reference, acceleration, for example, is an invariant and, more important, so are Newton’s laws of motion. A statement of what the invariant quantities are is called a relativity principle; it says that for such quantities the reference frames are equivalent to each other, no one of them having an absolute or privileged status relative to the others. Newton expressed his relativity principle in the following way: “The motion of bodies included in a given space are the same amongst themselves, whether that space is at rest or moves uniformly forward in a straight line.” In addition, the differential equations that hold in one frame also hold in the other. That is, the classical laws of mechanics are the same in both.
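The two sets of transformation equations given above can be compared on a single event. The sketch below is illustrative only: the function names are hypothetical, units are chosen so that c = 1, and the relative velocity v = 0.6 is an arbitrary assumption. It transforms the event "a light pulse that left the origin at t = 0 reaches x = ct" into the frame K′ and reports the speed of the pulse seen there.

```python
import math

def galilean(x, t, v):
    """Galilean transformation of an event (x, t) into frame K'."""
    return x - v * t, t

def lorentz(x, t, v, c):
    """Lorentz transformation of an event (x, t) into frame K'."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)

C = 1.0   # assumed units in which the speed of light is 1
V = 0.6   # assumed relative velocity of K' (illustrative)

# Event: the pulse, emitted at the common origin at t = 0, is at x = c*t.
t = 1.0
x = C * t

xg, tg = galilean(x, t, V)
xl, tl = lorentz(x, t, V, C)

print("pulse speed in K' (Galilean):", xg / tg)
print("pulse speed in K' (Lorentz): ", xl / tl)
```

Under the Galilean equations the pulse speed in K′ is c − v, while under the Lorentz equations it remains c, which is the property that preserves the form of Maxwell’s equations.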

 

Frame of Reference

A physical event is defined as something that happens independently of the frame of reference that may be used to describe it. For example, an event occurs when two particles collide with each other or when a tiny light source is turned on. The event happens at a point in space and at an instant of time. An event may be specified by four (space-time) measurements in a particular frame of reference, giving three position numbers x, y, z, and the time t when the event occurs. For example, the collision of two particles might occur at x = 1 meter, y = 2 meters, z = 4 meters, and at time t = 6 sec in some frame of reference (say, a laboratory on earth), so that the four numbers (1, 2, 4, 6) specify the event in that frame of reference. The same event observed from a different frame of reference (for example, an airplane flying overhead) would also be specified by four numbers, although the numbers may be different from those in the laboratory frame of reference. Thus in order to describe an event, the first step is to establish a frame of reference. Consider a physical event at point P, whose space and time coordinates are measured in each inertial frame of reference. An observer attached to the frame K specifies by means of a meter stick and clock the location and time of the occurrence of the event, ascribing space coordinates x, y, and z and time t to it.

An observer attached to the frame K′, using his measuring instruments, specifies the same event by space-time coordinates x′, y′, z′, and t′. The coordinates x, y, z specify the position in space of P relative to the origin O as measured by the observer in K, and t is the time of occurrence of P that the observer in K records with his clock. Similarly, the coordinates x′, y′, z′ refer the position of event P to the origin O′, and t′ refers the time of P to the clock in inertial frame K′. Now in order to establish the relationship between the measurements in K and K′, the two inertial observers must use meter sticks which have been compared and calibrated against one another, and clocks which have been synchronized and calibrated against one another. The classical procedure assumes that length intervals and time intervals are absolute, that is, that they are the same for all inertial observers of the same events. For simplicity, let us set the clocks of each observer to read zero at the instant that the origins O and O′ of the frames of reference K and K′, which are in uniform relative motion, coincide. The relations between the measurements made in K and K′ are given by the Galilean coordinate transformations.


$$x' = x - vt,$$

$$y' = y,$$

$$z' = z,$$

$$t' = t.$$


These equations assume that time can be defined independently of any particular frame of reference. This is the implicit assumption of classical Newtonian physics. It is made explicit by the equation t′ = t. It follows that the time interval between two given events, say P and Q, is the same for each observer, that is,


$$t'_P - t'_Q = t_P - t_Q,$$


and that the distance, or space interval, between two points, say A and B, measured at a given instant, is the same for each observer, that is,


$$x'_B - x'_A = x_B - x_A.$$


Note carefully that the two measurements (of the end points of the space interval or distance) are made by each observer and that they are assumed to be made at the same time ($t_A = t_B$, or $t'_A = t'_B$). The assumption that the measurements are made at the same time, that is, simultaneously, is a crucial part of our definition of the length of a moving rod. Of course, we should not measure the locations of the end points of the rod at different times to get the length of the moving rod.
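The role of simultaneity in the length measurement can be illustrated with the Galilean transformation itself. In this minimal sketch a rod is at rest in K with end points A and B, and its end points are located in K′ either at the same instant or at two different instants. All numerical values and names are assumptions chosen for illustration.

```python
def galilean_x(x, t, v):
    """x-coordinate in K' of an event at position x and time t in K."""
    return x - v * t

V = 3.0                # assumed relative velocity of K'
X_A, X_B = 2.0, 5.0    # end points of a rod at rest in K (length 3)

# End points located at the same instant (t_A = t_B = 4).
same_time = galilean_x(X_B, 4.0, V) - galilean_x(X_A, 4.0, V)

# End points located at different instants (t_A = 4, t_B = 6).
different_times = galilean_x(X_B, 6.0, V) - galilean_x(X_A, 4.0, V)

print("length in K:                        ", X_B - X_A)
print("length in K', simultaneous readings:", same_time)
print("'length' in K', readings 2 s apart: ", different_times)
```

Only the simultaneous readings reproduce the length measured in K; the readings taken at different times mix in the distance the rod has moved relative to K′ between the two measurements.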

 

Classical Mechanics

The time interval and space interval measurements described above are absolute according to the Galilean coordinate transformation; that is, they are the same for all inertial observers, the relative velocity v of the frames being arbitrary and not entering into the results. When we add the assumption of classical physics that the mass of a body is a constant, independent of its motion with respect to an observer, we can conclude that classical mechanics and the Galilean transformations imply that length, mass, and time (the three basic quantities of mechanics) are all independent of the relative motion of the observer (or measurer).

How do the measurements of velocity and acceleration compare when made by different inertial observers? Since the position of a particle in motion is a function of time, the velocity and acceleration may be expressed as time derivatives of position. By carrying out successive time differentiations of the Galilean transformation equations we can calculate the velocity and acceleration. If we differentiate the equation


$$x' = x - vt$$

with respect to time t, we get

$$\frac{dx'}{dt} = \frac{dx}{dt} - v.$$

But since t′ = t, the operation d/dt is identical to the operation d/dt′, so that

$$\frac{dx'}{dt} = \frac{dx'}{dt'}.$$

Therefore,

$$\frac{dx'}{dt'} = \frac{dx}{dt} - v.$$

Similarly,

$$\frac{dy'}{dt'} = \frac{dy}{dt}, \qquad \frac{dz'}{dt'} = \frac{dz}{dt}.$$

Now let $dx'/dt' = u'_x$, the x-component of the velocity measured in K′, and $dx/dt = u_x$, the x-component of the velocity measured in K, and similarly for the other components. Substituting, we get the classical velocity addition theorem:

$$u'_x = u_x - v,$$

$$u'_y = u_y,$$

$$u'_z = u_z.$$

To obtain the acceleration transformation we just differentiate these velocity equations with respect to time. Since v is a constant, we get

$$\frac{du'_x}{dt'} = \frac{du_x}{dt}, \qquad \frac{du'_y}{dt'} = \frac{du_y}{dt}, \qquad \frac{du'_z}{dt'} = \frac{du_z}{dt}.$$

That is,

$$a'_x = a_x, \qquad a'_y = a_y, \qquad a'_z = a_z.$$

Hence, a′ = a. That is, the measured components of acceleration of a particle are unaffected by the uniform relative velocity v of the reference frames.


The velocities measured in different inertial frames by observers who are in relative motion will differ by the relative velocity of the two observers, which in the case of inertial observers is a constant velocity. If the particle velocity changes, the change will be the same for both observers. That is, each observer will measure the same acceleration for the particle. The acceleration of a particle is the same in all reference frames which are moving relative to one another with constant velocity; that is, a′ = a.
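The velocity addition theorem and the invariance of acceleration can be checked numerically on a sample trajectory. The sketch below assumes a particle with constant acceleration along x (all numerical values and function names are illustrative assumptions) and estimates the derivatives with central differences, which introduce no truncation error for a quadratic trajectory.

```python
def x_in_K(t, x0=1.0, u0=5.0, a=2.0):
    """Position of the particle in frame K (assumed uniform acceleration)."""
    return x0 + u0 * t + 0.5 * a * t * t

def x_in_K_prime(t, v):
    """Same particle seen from K', using x' = x - v t and t' = t."""
    return x_in_K(t) - v * t

def velocity(position, t, h=1e-2):
    """Central-difference estimate of dx/dt."""
    return (position(t + h) - position(t - h)) / (2 * h)

def acceleration(position, t, h=1e-2):
    """Central-difference estimate of d^2x/dt^2."""
    return (position(t + h) - 2 * position(t) + position(t - h)) / (h * h)

V = 3.0   # assumed relative velocity of K'
T = 2.0   # instant at which the motion is examined

u = velocity(x_in_K, T)
u_prime = velocity(lambda t: x_in_K_prime(t, V), T)
a = acceleration(x_in_K, T)
a_prime = acceleration(lambda t: x_in_K_prime(t, V), T)

print("u, u':", u, u_prime)
print("a, a':", a, a_prime)
```

The two velocities differ by v, while the two accelerations agree to within rounding error.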

In classical physics the mass is also unaffected by the motion of the reference frame. Hence, the product ma will be the same for all inertial observers. If F = ma is taken as the definition of force, then each observer obtains the same measure for each force. If F = ma, then F′ = ma′ and F = F′. Newton’s laws of motion and the equations of motion of a particle would be exactly the same in all inertial systems. And since, in mechanics, the conservation principles of energy, momentum and angular momentum can all be shown to be consequences of Newton’s laws, it follows that the laws of mechanics are the same in all inertial frames. That is, not only is acceleration an invariant, but so are Newton’s laws; they are the same for all inertial observers.

 

Maxwell’s Equations and Galilean Transformations

Now let us consider Maxwell’s equations. Let us inquire whether the laws of electromagnetism (and any other laws of physics in addition to those of mechanics) are invariant under Galilean transformations. At the end of the nineteenth century, it was widely believed that the same partial differential equations held in any Galilean frame. That is, it was believed that the Newtonian relativity principle would hold not only for mechanics but for all of physics. In other words, no inertial frame would be preferred over any other, and no type of experiment in physics, not merely mechanical ones, carried out in a single frame would enable us to determine the velocity of our frame relative to another frame. There would then be no preferred, or absolute, frame of reference. Thus, it was believed that what was true for Newtonian mechanics was also true for electromagnetism. But this was not so. When the Galilean laws of transformation were applied to the electromagnetic equations in K in order to obtain them in K′, it was found that the equations were modified by the addition of terms that involved the relative velocity of the two frames. Maxwell’s equations are not preserved in form by the Galilean transformations. The reason for this is that velocity is not invariant under the Galilean transformations, whereas Maxwell’s equations contain the constant


$$c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}},$$


the speed of light, the velocity of propagation of a plane wave in vacuum. But such a velocity cannot be the same for observers in different inertial frames, according to the Galilean transformations, so that electromagnetic effects will probably not be the same for different inertial observers. Consider a light signal or pulse sent to the right with the velocity c and another sent to the left with the velocity c. An observer moving to the right with velocity v is “catching up” to the first light signal, and so for that observer the signal has the velocity c − v.

On the other hand, this observer is running away from the second light signal, which has a velocity relative to the observer of c + v. That is, in a frame K′ moving at a constant velocity v with respect to the aether frame, an observer would measure a different velocity for the light pulse, ranging from c + v to c − v depending upon the direction of relative motion, according to the Galilean velocity transformation. For the moving observer the two light signals do not propagate with the same velocity, and so Maxwell’s equations do not have the same form for that observer. As far as Maxwell’s equations are concerned, there is only one preferred frame, the frame at rest with respect to the aether.
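The range of speeds from c − v to c + v can be sketched by applying the Galilean velocity transformation to pulses sent in different directions. In the sketch below, theta is the angle (an assumed parametrization, not taken from the text) between the pulse direction in the aether frame and the velocity of the frame K′; the function name is hypothetical.

```python
import math

def galilean_light_speed(c, v, theta):
    """Speed of a light pulse in K' according to the Galilean transformation.

    The pulse moves at speed c in the aether frame at angle theta to the
    velocity of K'; the frame velocity v is subtracted from its x-component.
    """
    ux = c * math.cos(theta) - v
    uy = c * math.sin(theta)
    return math.hypot(ux, uy)

C = 3e8   # speed of light in the aether frame, m/sec
V = 3e4   # assumed speed of K' through the aether, m/sec

ahead = galilean_light_speed(C, V, 0.0)        # pulse sent along the motion
behind = galilean_light_speed(C, V, math.pi)   # pulse sent against the motion

print(f"c - speed(ahead)  = {C - ahead:.1f} m/s")
print(f"speed(behind) - c = {behind - C:.1f} m/s")
```

Intermediate angles give speeds between these two extremes, which is the direction dependence that the Michelson-Morley experiment was designed to detect.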

Thus the transformation of Maxwell’s electromagnetic equations by the Galilean transformation from one frame of reference to another moving with a constant velocity with respect to the first showed that Maxwell’s equations did not behave in the same way as Newton’s laws of mechanics. The fact that the Galilean transformations apply to the Newtonian laws of mechanics but not to Maxwell’s laws of electromagnetism requires us to choose one of the following alternatives.


1.  A relativity principle exists for mechanics, but not for electrodynamics. In electrodynamics there is a preferred inertial frame; that is, the aether frame. If this alternative is true, the Galilean transformations would apply and it would be possible to find the aether frame experimentally.


2.  A relativity principle exists both for electrodynamics and for mechanics, but the laws of electrodynamics given by Maxwell are not correct. If this alternative is true, it should be possible to perform experiments that show deviations from Maxwell’s electrodynamics and to reformulate the electromagnetic laws. The Galilean transformations would apply to these reformulated laws.


3.  A relativity principle exists both for electrodynamics and for mechanics, but the laws of mechanics as given by Newton are not correct. If this alternative is true, it should be possible to perform experiments that would show deviations from Newtonian mechanics and to reformulate the mechanical laws. In that case, the correct transformation laws would not be the Galilean transformation laws, but some new ones which are consistent with Maxwell’s classical electromagnetism and the new mechanics.


Alternatives 1 and 2 must be rejected on experimental grounds, and alternative 3 turns out to be the true one; the new transformation laws are the Lorentz transformation equations. The Michelson-Morley experiment showed that the aether hypothesis is untenable, and experiments show that the laws of electrodynamics are correct and do not need modification. The speed of light (an electromagnetic radiation) is the same in all inertial systems, independent of the relative motion of source and observer. Hence a relativity principle, applicable both to mechanics and to electromagnetism, is operating. It cannot be the Galilean principle, since that requires the speed of light to depend on the relative motion of source and observer. The Galilean transformations must be replaced and, therefore, the basic laws of mechanics, which were consistent with the Galilean transformations, must be modified. The Galilean transformation equations were replaced by the Lorentz transformation equations, and the modification of the basic laws of mechanics was carried out by Albert Einstein.

 

Einstein’s Special Theory of Relativity

In 1905, before many of the experiments were performed, Einstein (1879-1955), apparently unaware of several important papers on the subject, published a paper titled “On the Electrodynamics of Moving Bodies” in the same volume of the Annalen der Physik as his paper on Brownian motion. In it he set forth the relativity theory of Poincaré and Lorentz with some modifications, and he asserted as a fundamental principle the constancy of the velocity of light, that is, that the velocity of light in vacuo is the same in all systems of reference which are moving uniformly relative to each other. He wrote,

“…for all coordinate systems for which the mechanical equations hold, the equivalent electrodynamical and optical equations hold also. … In the following we make this assumption (which we shall subsequently call the Principle of Relativity) and introduce the further assumption, an assumption which is at first sight quite irreconcilable with the former one, that light is propagated in vacant space with a velocity c which is independent of the nature of motion of the emitting body. These two assumptions are quite sufficient to give us a simple and consistent theory of electrodynamics of moving bodies on the basis of the Maxwellian theory for bodies at rest.”

 

These two assumptions of Einstein may be restated as follows.


1.  The laws of physics are the same in all inertial systems. No preferred inertial system exists. (The Principle of Relativity)


2.  The speed of light in free space has the same value c in all inertial systems. (The Principle of the Constancy of the Speed of Light)


The entire Special Theory of Relativity, as it was called, is derived directly from these two assumptions. Their simplicity, boldness, and generality are characteristic of the genius of Einstein. From these assumptions the theory was able not only to explain all existing experimental results but also to predict new effects, which were confirmed by later experiments. No experimental objections to the Special Theory of Relativity have yet been found. Einstein explained that these postulates seem paradoxical only if we believe in an absolute definition of time intervals and lengths. He carefully examined how one measures times and lengths, and concluded that his postulates were not logically irreconcilable; but, if they are true, one must abandon the idea of absolute time and absolute space.

Einstein determined three relationships, using the Lorentz transformations, between two inertial systems (that is, systems between which there is no relative acceleration or rotation) that are moving with a velocity v relative to one another.


1.  Time Dilation. If two events at different places appear simultaneous to one observer, they will in general not be simultaneous to another observer who is moving uniformly relative to the first one. Quantitatively, if the time interval between two ticks of a clock (or any two events at the location of the clock) is T₀ according to an observer at rest relative to the clock, the time interval T according to an observer moving uniformly relative to the clock at speed v is:


T = T₀/√(1 – v²/c²) = γT₀.


That is, the time interval T is longer than the time interval T₀ by the factor γ = 1/√(1 – v²/c²), which increases without limit as v approaches the speed of light.
This effect is called time dilation. That is, a clock in a moving system, as observed from another “stationary” system, is slowed down by the factor γ. This implies that any natural process, such as the oscillations of a pendulum or the beating of the heart, will be slower by a very small amount when it is in motion relative to a frame of reference in which a “standard” clock is maintained. This effect is completely negligible at ordinary velocities, even in the case of the seemingly “high” velocities of jet planes or rockets. But in atoms traveling at speeds approaching the speed of light, as in a particle accelerator, the period of vibrations within them will be distinctly longer than when those same atoms are at rest or moving slowly.


2.  Length Contraction. Lengths of moving objects, on the other hand, are contracted. An object of length L₀, measured at rest, will have a shorter length, L, measured in the direction of the motion by a moving observer. If the observer is moving at speed v parallel to the object, then


L = L₀√(1 – v²/c²) = αL₀.


That is, the length L is shorter than the length L₀ by the factor α = √(1 – v²/c²) = 1/γ, which approaches zero as v approaches the speed of light. This effect is called length contraction and is also known as the Lorentz-FitzGerald contraction.


3.  Relativistic Addition of Velocities. Einstein showed that time dilation and length contraction do not lead to any contradictions, and have many interesting consequences, including the relativistic addition of speeds. If an object moves at speed v₁ relative to some observer, and a second object moves in the same direction at speed v₂ relative to the first object, the second object moves at a speed v relative to the first observer given by


v = (v₁ + v₂)/(1 + v₁v₂/c²).


If v₁ and v₂ are both less than c, then so is v. If either v₁ or v₂ is equal to c, then v = c. As a corollary to this relationship it follows that no material object can have a velocity greater than the velocity of light.
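The three relationships above can be tried out numerically. What follows is a minimal sketch, not part of Einstein’s derivation; the function names and the sample speeds (a jet at 250 m/s, 0.6c and 0.99c) are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def gamma(v: float) -> float:
    """Lorentz factor 1/sqrt(1 - v^2/c^2), valid for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def dilated_time(t0: float, v: float) -> float:
    """Time interval T = gamma * T0 seen by an observer moving at speed v relative to the clock."""
    return gamma(v) * t0

def contracted_length(l0: float, v: float) -> float:
    """Length L = L0 * sqrt(1 - v^2/c^2) = L0 / gamma along the direction of motion."""
    return l0 / gamma(v)

def add_velocities(v1: float, v2: float) -> float:
    """Relativistic sum of two parallel velocities."""
    return (v1 + v2) / (1.0 + v1 * v2 / C**2)

# Sample speeds are hypothetical, chosen only to contrast everyday and relativistic cases.
for label, v in [("jet, 250 m/s", 250.0), ("0.6 c", 0.6 * C), ("0.99 c", 0.99 * C)]:
    print(f"{label:>13}: gamma - 1 = {gamma(v) - 1.0:.3e}, "
          f"1 s -> {dilated_time(1.0, v):.6f} s, "
          f"1 m -> {contracted_length(1.0, v):.6f} m")

# Two speeds of 0.8 c combine to less than c; combining anything with c gives c.
print("0.8c + 0.8c =", add_velocities(0.8 * C, 0.8 * C) / C, "c")
print("c + 0.5c    =", add_velocities(C, 0.5 * C) / C, "c")
```

For the jet the correction to γ is far below anything a wristwatch could register, while at 0.99c the factor is about seven.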

 

Relativistic Momentum and Mass

The fact that it is impossible to accelerate an object to a speed above the speed of light means that Newton’s laws of motion must in some way be modified. Since Newton’s laws work well for almost all normal applications, the modified laws should differ from them only for speeds near the speed of light. As an object’s speed approaches the speed of light, it becomes harder and harder to accelerate it. The same force results in a smaller and smaller change in the speed as the object’s speed approaches the speed of light. In other words, an object’s resistance to acceleration, its inertia, increases. Einstein found that Newton’s Second Law (“Force equals the time rate of change of momentum”) could be saved only if the meaning of momentum were modified. And since momentum p is defined as mv, mass times velocity, the relativistic version of momentum must be:


p = mv = m₀v/√(1 – v²/c²) = γm₀v,

where γ = 1/√(1 – v²/c²).


For the classical definition of momentum as mass times velocity to hold, the definition of mass must be modified as follows:


m = m₀/√(1 – v²/c²) = γm₀,


where m₀ is the mass of the body at rest with respect to the observer and m is the mass of the body moving at the speed v relative to the observer. This definition of mass differs negligibly from the classical concept except at very high velocities, since the ratio v²/c² is ordinarily a very small fraction; the mass of a moving body is significantly larger only when the speed of the body v approaches the speed of light c.

As v approaches c, v²/c² becomes almost one, and √(1 – v²/c²) becomes close to zero. Since this square root is in the denominator of the formulas for relativistic momentum and inertia, these quantities become very large, so that the resistance to acceleration becomes very large also. If the speed v were equal to c, then the body’s inertia (mass) and momentum would become infinite.
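How quickly the relativistic momentum pulls away from the Newtonian value m₀v can be seen in a short calculation. This is a sketch under stated assumptions: the function name relativistic_momentum and the list of speeds are invented for the example, and the electron rest mass is a rounded value.

```python
import math

C = 299_792_458.0       # speed of light in vacuum, m/s
M_ELECTRON = 9.109e-31  # electron rest mass in kg (rounded; used only as a sample m0)

def relativistic_momentum(m0: float, v: float) -> float:
    """p = gamma * m0 * v for a body of rest mass m0 moving at speed v < c."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return g * m0 * v

# The ratio p_relativistic / p_newtonian equals gamma, i.e. m / m0.
for beta in (0.001, 0.1, 0.5, 0.9, 0.99, 0.999):
    v = beta * C
    ratio = relativistic_momentum(M_ELECTRON, v) / (M_ELECTRON * v)
    print(f"v = {beta:5.3f} c   p_rel / p_newton = m / m0 = {ratio:9.4f}")
```

The ratio stays close to one over most of the range and then climbs steeply in the last few percent below c, which is the growth in inertia described above.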

Einstein showed that with this definition Newton’s Third Law still holds for collisions between massive bodies. The total momentum of a system of colliding bodies is unchanged by the collision; and further, only with this relativistic definition of momentum is momentum conserved according to every observer, no matter with what speed he is moving relative to the bodies. For bodies that act on each other at a distance, the situation is a little more complicated. For example, the electric and magnetic forces are propagated by electromagnetic fields. Einstein showed that Maxwell’s complete theory of electromagnetic fields, together with the waves that can propagate with the field, is consistent with the Special Theory of Relativity. To deal with gravitational forces Einstein had to develop his General Theory of Relativity.

 

Relativistic Energy

Einstein was impelled by his new transformation equations, which relate time, space and velocities as measured in two systems of reference moving relative to each other, to apply them to kinetic energy and radiant energy. In a second paper in 1905, titled “Does the Inertia of a Body Depend Upon Its Energy Content?”, Einstein showed that there is a form of energy associated with the mass of a particle and defined the total energy E of a particle of rest mass m₀ and speed v to be


E = mc² = m₀c²/√(1 – v²/c²) = γm₀c².


This definition of the total energy of a particle results in the conservation of energy for an isolated system; but it does not include potential energy. Notice that the total energy of the particle is not zero when the particle is at rest. Setting v = 0 results in


E = m₀c².


This is the energy of the particle when it is at rest, and it is called the rest-mass energy of the particle, E₀.

For velocities small compared with the velocity of light, the expression

γ = 1/√(1 – v²/c²) = (1 – v²/c²)^(–1/2)

may be expanded by the binomial theorem,

(x + y)ⁿ = xⁿ + nxⁿ⁻¹y + [n(n – 1)/(1·2)]xⁿ⁻²y² + …,

with x = 1, y = –v²/c² and n = –1/2, to the series

(1 – v²/c²)^(–1/2) = 1 + ½(v²/c²) + ⅜(v⁴/c⁴) + …

with successively higher powers of v/c.


Since v/c is a small fraction, the term with v⁴/c⁴ and the higher terms may be neglected, and the expression above for the total energy then becomes, to a good approximation,


E ≈ m₀c²[1 + ½(v²/c²)] = m₀c² + ½m₀v².


This was interpreted by Einstein to mean that the energy associated with any particle is composed of two types: its permanent “rest energy” m₀c² and its classical kinetic energy ½m₀v². The last term on the right side of this equation has the same form as the classical kinetic energy K. But the first term on the right side, m₀c², is probably the best known and the most dramatic consequence of the Special Theory of Relativity. It is called the “rest energy” E₀ of the particle and it identifies the mass of the particle with energy. The rest energy is by definition the energy of the particle at rest, when v = 0 and K = 0. The total energy of the particle is the sum of its rest energy and its kinetic energy:


E = E₀ + K.
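The quality of the low-speed approximation can be tested by comparing the exact kinetic energy, K = E – E₀ = (γ – 1)m₀c², with the classical value ½m₀v². The following is only a sketch; the function names and the sample mass of 1 kg are assumptions for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def total_energy(m0: float, v: float) -> float:
    """E = gamma * m0 * c^2 for a particle of rest mass m0 and speed v < c."""
    return m0 * C**2 / math.sqrt(1.0 - (v / C) ** 2)

def kinetic_exact(m0: float, v: float) -> float:
    """K = E - E0, the total energy minus the rest energy m0 * c^2."""
    return total_energy(m0, v) - m0 * C**2

def kinetic_classical(m0: float, v: float) -> float:
    """Newtonian kinetic energy, the first velocity-dependent term of the series."""
    return 0.5 * m0 * v**2

m0 = 1.0  # assumed sample rest mass in kg
for beta in (0.01, 0.1, 0.5, 0.9):
    v = beta * C
    ratio = kinetic_classical(m0, v) / kinetic_exact(m0, v)
    print(f"v = {beta:4.2f} c   classical K / exact K = {ratio:.6f}")
```

At one percent of the speed of light the two values agree to better than one part in ten thousand; at 0.9c the classical expression gives well under half of the exact kinetic energy.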


This identification of mass and energy, given in the form of the famous equation


E = mc²


governs the transformation of mass into energy and vice versa. Through the application of this equation many previously unexplained physical phenomena of the universe, such as the apparently inexhaustible source of heat of the sun and the stars, the transmutation of radioactive elements, and other nuclear processes, were understood. And its application led to the development of atomic, or more accurately, nuclear energy. It does not tell us how to convert mass into energy, but it identifies the amount of energy that is equivalent to the mass of the particle.
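To give a sense of scale, the rest energy of an ordinary amount of matter can be computed directly from E₀ = m₀c². This is a small illustrative sketch; the function name rest_energy and the sample masses of one gram and one kilogram are assumptions, and the conversion uses 1 kWh = 3.6 × 10⁶ J.

```python
C = 299_792_458.0      # speed of light in vacuum, m/s
JOULES_PER_KWH = 3.6e6  # one kilowatt-hour expressed in joules

def rest_energy(m0: float) -> float:
    """Rest energy E0 = m0 * c^2 in joules for a rest mass m0 in kilograms."""
    return m0 * C**2

# Sample masses are hypothetical, chosen only to show the order of magnitude.
for mass_kg in (1e-3, 1.0):
    e0 = rest_energy(mass_kg)
    print(f"{mass_kg:g} kg -> {e0:.4e} J = {e0 / JOULES_PER_KWH:.4e} kWh")
```

The calculation only states how much energy is equivalent to a given mass; as noted above, it says nothing about how, or whether, that mass can actually be converted.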

 

Experimental Verification of the Theory

The theory of relativity was designed to agree with the experimental fact that the velocity of light is observed to be the same in frames of reference which are in uniform translation with respect to each other. But in addition to having achieved this, the theory predicts a number of new phenomena, such as time dilation, length contraction, relativistic increase in mass, and a relation between mass and energy. This is to be expected from a scientific theory. A scientific theory is initially accepted tentatively until the predictions of the theory are verified by experiment. The first verification of the special theory of relativity was performed in 1909 by Bucherer. It consisted of a measurement of the masses of high velocity electrons. It showed that the mass of the electron increased according to the relativistic mass increase relation. A number of other experimental verifications of length contraction and time dilation have been performed. One of the clearest of these involved measuring the lifetime of unstable particles called mesons at various velocities. The relation between mass and energy has been verified by an overwhelming amount of evidence, including the atomic bomb and nuclear energy. The predictions of the special theory of relativity have been confirmed at every point, and there is now universal acceptance of its validity.
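The meson lifetime measurements mentioned above rest on the time dilation formula T = γT₀. The sketch below shows the size of the expected effect; it does not reproduce any particular experiment. The rest lifetime of 2.6 × 10⁻⁸ s (roughly that of a charged pion) and the list of speeds are assumptions chosen for illustration, as is the function name lab_lifetime.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s
TAU_REST = 2.6e-8  # assumed mean lifetime at rest, in seconds (roughly a charged pion)

def lab_lifetime(tau0: float, beta: float) -> float:
    """Mean lifetime T = gamma * T0 seen in the laboratory for a particle moving at beta = v/c."""
    return tau0 / math.sqrt(1.0 - beta**2)

for beta in (0.1, 0.5, 0.9, 0.99, 0.999):
    tau = lab_lifetime(TAU_REST, beta)
    mean_path = beta * C * tau  # average distance travelled before decay, in metres
    print(f"v = {beta:5.3f} c   lifetime = {tau:.3e} s   mean path = {mean_path:8.2f} m")
```

Without time dilation the mean path could never exceed c times the rest lifetime, about 7.8 m for the value assumed here, so longer observed paths at high speed are a direct sign of the effect.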