Caroline Jones 13JLB

Mathematical derivations of Kepler's Laws of Planetary Motion and the equations of planetary orbits

Contents

1. Introduction

2. Ellipses

i. Cartesian form

ii. The elliptical properties of Jupiter’s orbit

iii. Polar form

3. Kepler’s First Law

4. Kepler’s Second Law

5. Kepler’s Third Law

6. Conclusion

7. Bibliography

1. Introduction

I have always been fascinated by the popular science of Astrophysics, mainly from watching programmes by Stephen Hawking and Brian Cox. From very early on in the history of the human race, people have been perplexed by astronomical phenomena, which makes Astronomy one of the very oldest branches of Physics and a very interesting subject to study historically as well. It was also often associated with Philosophy, another of my favourite subjects: some of the most ancient Astronomers also studied Philosophy (such as Thales, Anaximander and Aristotle), as both disciplines were considered to be some of the highest branches of academia.

I was introduced to Kepler’s Laws in my Physics lessons, where we learnt a simple derivation for Kepler’s Third Law using equations of motion. I found the concept interesting, but the proof unsatisfying. This is because this particular proof approximates the equation of a planet’s orbit to a circle, whereas Kepler’s First Law tells us that orbits are ellipses. I therefore wanted to find a more general derivation that showed that Kepler’s Third Law is true for all ellipses, not just specific to circles. For this reason, I decided to research how Kepler’s Laws of Planetary Motion[1] can be mathematically derived.

These derivations are fascinating for me as they combine two of my favourite topics in maths – geometry and calculus. Researching planetary orbits led me to examine conic sections: the shapes formed from the intersection of a plane and a circular cone at different angles, as shown in Figure 1[2]. Kepler’s Laws look specifically at ellipses, and over the course of this essay I will explore the properties of ellipses and their link to the orbits of planets, using predominantly calculus and algebra to prove Kepler’s three laws. In doing so, my aim is to make the mathematics and physics of the derivations clear to understand for an audience of my peers studying Maths and Physics at Higher Level, as this is not covered by the standard curriculum, and I have not been able to find it explained at this level in the many sources I consulted. I hope to share my enthusiasm and make the ideas that I describe clear and engaging.

Over the course of this essay, I aim to make sense of the following:

  • How equations of circles and ellipses come about, in different coordinate systems
  • Kepler’s First Law: The orbit of a planet is an ellipse with the Sun at one of the two foci.
  • Kepler’s Second Law: A line segment joining a planet and the Sun sweeps out equal areas during equal intervals of time.
  • Kepler’s Third Law: The square of the orbital period of a planet is proportional to the cube of the semi-major axis of its orbit.

2. Ellipses

i. Cartesian form

Kepler’s Laws of Planetary Motion are rules that describe the orbits of planets, so it seems necessary to me to first explore the equations that can model those orbits.

In their most basic form, the orbits of planets are approximated to circles, which are modelled by the equation:

$$x^2 + y^2 = r^2$$

where r is the radius of the circle, the centre of the circle is the origin, and $(x, y)$ is any point on the circle. I do not think I will need to consider the more complicated equation for a circle whose centre is not the origin, because in a planetary system we can define the centre to be anywhere, therefore it is convenient to use the origin.

Figure 2 shows an example of a circle with radius of 1.

However, orbits are more accurately defined by ellipses, modelled by the Cartesian equation (named after the mathematician and philosopher René Descartes):

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$

where a is the horizontal semi-axis (the distance from the centre to the point on the ellipse with the same y-coordinate),

b is the vertical semi-axis (the distance from the centre to the point on the ellipse with same x-coordinate),

and the centre of the ellipse is the origin.

This can be confusing as “axis” is usually associated with the x-axis, which is a line against which we measure other shapes, whereas here the semi-axes are distances.

Although many books derive this by using trigonometry, I think it is possible to derive this equation rather neatly using only transformations of the equation for a circle. In an equation, replacing $x$ by $\frac{x}{a}$ corresponds to a stretch of scale factor $a$ in the horizontal direction, and replacing $y$ by $\frac{y}{b}$ corresponds to a stretch of scale factor $b$ in the vertical direction. Applying both stretches to the circle of radius 1, $x^2 + y^2 = 1$, gives $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.

One of the semi-axes will be the semi-major axis (a), which is the distance from the centre to the furthest point on the ellipse, and one will be the semi-minor axis (b), which is the distance from the centre to the closest point on the ellipse. It is conventional to draw the horizontal semi-axis as the semi-major axis and the vertical as the semi-minor axis; it is helpful to have conventions like this in Maths as it aids mathematicians in making descriptions of shapes universally understandable.

Figure 3 shows a graph of an ellipse, with .

The area of an ellipse is given by

$$A = \pi ab$$

If we use the idea of obtaining the shape of an ellipse using coordinate transformations, this formula is also easy to derive. The area of a circle of radius 1 around the origin is $\pi$, and when we stretch it by scale factor $a$ along the semi-major axis, and by scale factor $b$ along the semi-minor axis, this becomes $\pi ab$.

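An ellipse can also be defined by two points called its foci[3], which always lie along the major axis. An ellipse with foci $F_1$ and $F_2$ has the property that for any point P on the ellipse, $PF_1 + PF_2$ is constant, as depicted in Figure 4. This constant is equal to $2a$. We can understand this by writing $f$ for the distance from the centre to each focus and imagining the point P at $(a, 0)$, with $F_1$ at $(-f, 0)$ on the negative x-axis and $F_2$ at $(f, 0)$, so that $PF_1 = a + f$ and $PF_2 = a - f$. Therefore $PF_1 + PF_2 = 2a$. We can deduce from this that the distance from either focus to the point where the ellipse crosses the positive y-axis is $a$, as it is half the sum of the distances from the foci to the point on the ellipse.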

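Furthermore, I wish to prove $PF_1 + PF_2 = 2a$ as follows, using a proof that my Maths teacher helped me to derive. If we define the point P to be $(x, y)$, with the foci at $F_1 = (-f, 0)$ and $F_2 = (f, 0)$, then using Pythagoras, we know that

$$PF_1 = \sqrt{(x + f)^2 + y^2} \quad \text{and} \quad PF_2 = \sqrt{(x - f)^2 + y^2}$$

I shall now try to rewrite the first square root without $y$, using the Pythagorean relationship that we can see in Figure 4, $a^2 = b^2 + f^2$, and rearranging the Cartesian equation of the ellipse to get $y^2$:

$$y^2 = b^2\left(1 - \frac{x^2}{a^2}\right) = (a^2 - f^2)\left(1 - \frac{x^2}{a^2}\right)$$

$$PF_1 = \sqrt{(x + f)^2 + (a^2 - f^2)\left(1 - \frac{x^2}{a^2}\right)}$$

This simplifies as follows:

$$PF_1 = \sqrt{x^2 + 2fx + f^2 + a^2 - x^2 - f^2 + \frac{f^2x^2}{a^2}} = \sqrt{a^2 + 2fx + \frac{f^2x^2}{a^2}} = \sqrt{\left(a + \frac{fx}{a}\right)^2} = a + \frac{fx}{a}$$

Similarly therefore we know that the second square root will rearrange and simplify to:

$$PF_2 = a - \frac{fx}{a}$$

So $PF_1 + PF_2$ is equal to:

$$\left(a + \frac{fx}{a}\right) + \left(a - \frac{fx}{a}\right) = 2a$$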
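As a quick numerical check of the focal property, the sketch below picks points around an ellipse and adds up their distances to the two foci. It is only an illustration under stated assumptions: the values a = 5 and b = 3 are made up, the foci are taken to be at (-f, 0) and (f, 0), and the points are generated with the parametrisation (a cos t, b sin t), which the essay itself does not use.

```python
import math

def focal_distance_sum(a, b, t):
    """Return PF1 + PF2 for the point (a cos t, b sin t) on the ellipse.

    Assumes the foci lie at (-f, 0) and (f, 0), with f = sqrt(a^2 - b^2).
    """
    f = math.sqrt(a * a - b * b)
    x, y = a * math.cos(t), b * math.sin(t)
    d1 = math.hypot(x + f, y)  # distance from P to F1
    d2 = math.hypot(x - f, y)  # distance from P to F2
    return d1 + d2

a, b = 5.0, 3.0  # illustrative semi-axes, not taken from the essay
sums = [focal_distance_sum(a, b, 2 * math.pi * k / 12) for k in range(12)]
print(all(math.isclose(s, 2 * a) for s in sums))
```

Every sampled point should give the same total, 2a, whichever part of the ellipse it lies on.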

These notions of the foci of ellipses will link to Kepler’s First Law of planetary motion, which states that the orbit of a planet is an ellipse with the Sun at one of the two foci.

An important property of an ellipse will be its eccentricity, which can be thought of as how much a conic section deviates from being circular. Eccentricity (let us call it e) is defined by the following equation:

$$e = \frac{f}{a}$$

where $f$ is the distance from the centre to a focus, as shown in Figure 4[4]. For an ellipse 0 < e < 1, and in the case that e = 0 the ellipse is a circle, since the focus has a distance of zero from the centre.

Alternatively, the eccentricity can be found using the following property (which will later be used in Kepler’s Third Law):

$$e = \sqrt{1 - \frac{b^2}{a^2}}$$

We can show this beginning with the relation $e = \frac{f}{a}$ and (using Pythagoras) substituting in $f = \sqrt{a^2 - b^2}$:

$$e = \frac{\sqrt{a^2 - b^2}}{a} = \sqrt{\frac{a^2 - b^2}{a^2}} = \sqrt{1 - \frac{b^2}{a^2}}$$

ii. The elliptical properties of Jupiter’s orbit

The links between all of these variables and properties are better seen using an example, so I shall try to model the orbit of Jupiter using these equations and plot the values in Autograph.

Jupiter:[5] Semi-major axis, a = km

= 5.203 AU

(where AU is Astronomical Units, a unit of length approximately equal to the mean distance from the Earth to the Sun)

Eccentricity, e = 0.048

In order to find the distance of the focus to the centre, we can use the equation $e = \frac{f}{a}$, and substitute in the known values until everything is in terms of f.

$$f = ae = 5.203 \times 0.048 = 0.250 \text{ AU to 3dp}$$

Now that we have the distance to the foci, we can also work out the semi-minor axis, since (using Figure 4) we can see that $a$, $b$ and $f$ have a Pythagorean relationship:

$$b = \sqrt{a^2 - f^2} = \sqrt{5.203^2 - 0.250^2} = 5.197 \text{ AU to 3dp}$$

I have then plotted a graph on Autograph in order to visualise this data, shown in Figure 5. However since the eccentricity is very small, the orbit does not look noticeably elliptical. Therefore perhaps it would have been better to have chosen a body with greater eccentricity, for example Pluto (although it has officially been classified as a dwarf planet rather than a planet since 2006).

I have drawn arrows showing the semi-axes, and plotted and marked the points for the foci (one of which would be the Sun, according to Kepler’s First Law).
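The arithmetic for Jupiter can be reproduced in a few lines of Python. This is only a sketch: the values of a and e are the ones quoted above, the variable names are my own, and the last line recovers the eccentricity from a and b as a consistency check.

```python
import math

a = 5.203  # semi-major axis of Jupiter in AU (value quoted in the essay)
e = 0.048  # eccentricity of Jupiter (value quoted in the essay)

f = a * e                             # focus-to-centre distance, from e = f/a
b = math.sqrt(a**2 - f**2)            # semi-minor axis, from a^2 = b^2 + f^2
e_check = math.sqrt(1 - b**2 / a**2)  # eccentricity recovered from a and b

print(f"f = {f:.3f} AU, b = {b:.3f} AU, e = {e_check:.3f}")
```

Because b differs from a only in the third decimal place, it is no surprise that the plotted orbit looks almost circular.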

iii. Polar form

However, an ellipse can also be defined using the polar equation – which is a different way of expressing the same shape.

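Using Figure 6[6], let us start by defining $F_1$ (or the Sun) to be the origin, and $F_2$ to be the point $(2f, 0)$ – meaning that it is a distance of $2f$ from $F_1$ on the x-axis. Normally mathematicians use the letter $c$ instead of $f$, but I thought I would use $f$ for the sake of consistency, as this is how I have previously defined the distance from a focus to the centre of the ellipse, therefore the distance between the foci is $2f$. We also define $\mathbf{r}$ to be the vector $\overrightarrow{F_1P}$, therefore $\overrightarrow{F_2P}$ is $\mathbf{r} - 2f\mathbf{i}$ (where $\mathbf{i}$ is the horizontal unit vector).

We know that $PF_1 + PF_2 = 2a$, therefore:

$$|\mathbf{r}| + |\mathbf{r} - 2f\mathbf{i}| = 2a$$

Since $|\mathbf{r}| = r$, we can work out that $|\mathbf{r} - 2f\mathbf{i}| = 2a - r$, and so now, to get rid of the modulus signs, we can square both sides and say that:

$$|\mathbf{r} - 2f\mathbf{i}|^2 = (2a - r)^2$$

The vector $\mathbf{r}$ can also be written by its polar co-ordinates as $r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j}$ (where $\mathbf{j}$ is the vertical unit vector), so that:

$$(r\cos\theta - 2f)^2 + (r\sin\theta)^2 = (2a - r)^2$$

Note that $2f$ is subtracted from the horizontal component (ie. $r\cos\theta$) and not the vertical component (ie. $r\sin\theta$). We can now rearrange to make the equation in terms of $r$.

$$r^2\cos^2\theta - 4fr\cos\theta + 4f^2 + r^2\sin^2\theta = 4a^2 - 4ar + r^2$$

$$r^2 - 4fr\cos\theta + 4f^2 = 4a^2 - 4ar + r^2$$

$$r(a - f\cos\theta) = a^2 - f^2$$

$$r = \frac{a^2 - f^2}{a - f\cos\theta}$$

Since $PF_1 + PF_2 = 2a$, we can imagine when the point P is directly above the centre of the ellipse, meaning that the distance $a$ is the hypotenuse of a right-angled triangle with sides $b$ and $f$. Therefore $a^2 = b^2 + f^2$. Therefore:

$$r = \frac{b^2}{a - f\cos\theta}$$

We have already defined the eccentricity to be $e = \frac{f}{a}$, therefore:

$$r = \frac{b^2/a}{1 - e\cos\theta}$$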

The semi-latus rectum is the length of the segment perpendicular to the major axis from one of the foci to the ellipse, as shown in Figure 7[7]. The semi-latus rectum, $l$, is equal to $\frac{b^2}{a}$, as I shall try to prove.

Since the eccentricity is the ratio of the distance to the foci from the centre and the semi-major axis (ie. $e = \frac{f}{a}$), the distance to the foci can be said to be $ae$, meaning that, in Cartesian coordinates with the centre of the ellipse at the origin, the coordinates for the two foci are $(ae, 0)$ and $(-ae, 0)$. Since the semi-latus rectum is the segment perpendicular to the semi-major axis from a focus to the ellipse, the equation of this line segment is just $x = ae$ or $x = -ae$. The intersection of these lines with the ellipse is the end point of the semi-latus rectum, therefore the y-coordinate there is the length $l$. So we can substitute $x = ae$ and $y = l$ into the Cartesian equation for the ellipse:

$$\frac{a^2e^2}{a^2} + \frac{l^2}{b^2} = 1 \quad \Rightarrow \quad l^2 = b^2(1 - e^2)$$

As we have already stated that $e = \sqrt{1 - \frac{b^2}{a^2}}$, we can substitute in to get

$$1 - e^2 = \frac{b^2}{a^2}$$

Substituting into the previous equation:

$$l^2 = b^2 \times \frac{b^2}{a^2} \quad \Rightarrow \quad l = \frac{b^2}{a}$$

Therefore substituting this into the previous equation for an ellipse, $r = \frac{b^2/a}{1 - e\cos\theta}$, the polar equation for an ellipse can be given by:

$$r = \frac{l}{1 - e\cos\theta}$$
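To see that the polar and Cartesian forms really describe the same curve, the sketch below generates points from a polar equation and substitutes them back into the Cartesian one. It assumes the convention set up above: the polar form is taken to be r = l / (1 - e cos θ) with l = b²/a, the origin is at the focus F1, and the other focus is at (2f, 0), so the centre of the ellipse is at (f, 0). The function names and the values a = 5, b = 3 are illustrative only.

```python
import math

def polar_radius(l, e, theta):
    """Distance from the focus F1, using the assumed form r = l / (1 - e cos(theta))."""
    return l / (1 - e * math.cos(theta))

def cartesian_residual(a, b, theta):
    """Return x^2/a^2 + y^2/b^2 - 1 for the polar point at angle theta.

    The polar origin is F1 and the centre of the ellipse is at (f, 0),
    so the x-coordinate is shifted by f to measure it from the centre.
    """
    f = math.sqrt(a * a - b * b)
    e = f / a
    l = b * b / a
    r = polar_radius(l, e, theta)
    x = r * math.cos(theta) - f  # horizontal position relative to the centre
    y = r * math.sin(theta)
    return x * x / (a * a) + y * y / (b * b) - 1

a, b = 5.0, 3.0  # illustrative semi-axes, not taken from the essay
residuals = [cartesian_residual(a, b, 2 * math.pi * k / 16) for k in range(16)]
print(max(abs(res) for res in residuals))
```

The printed value should be zero up to floating-point rounding, meaning every polar point also satisfies the Cartesian equation.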

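Two other important features of an ellipse are the apoapsis (the longest distance from a focus to the ellipse) and the periapsis (the shortest distance from a focus to the ellipse). Looking at Figure 8[8] we can see that both the longest and shortest distances are going to lie on the x-axis. The distance from the focus to the point where the ellipse crosses the x-axis is equal to $a + f$ at the apoapsis and $a - f$ at the periapsis.

This can be proven using some of the properties of an ellipse that I looked at earlier, rearranging the equation $e = \frac{f}{a}$ to give $f = ae$, writing $l = \frac{b^2}{a} = a(1 - e^2)$, and substituting $\theta = 0$ and $\theta = \pi$ into the polar equation $r = \frac{l}{1 - e\cos\theta}$:

$$r(0) = \frac{a(1 - e^2)}{1 - e} = a(1 + e) = a + f \quad \text{and} \quad r(\pi) = \frac{a(1 - e^2)}{1 + e} = a(1 - e) = a - f$$

This makes sense as it states that from a focus, the distance to the furthest point on the ellipse cutting the x-axis is equal to the distance travelled to the centre ($f$), plus the length of the semi-major axis ($a$), while the distance to the nearest point is the semi-major axis minus that same distance.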
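Applying this to Jupiter gives a feel for how far its distance from the Sun varies. The sketch below is a hedged illustration: it assumes the polar form r = l / (1 - e cos θ) with l = a(1 - e²), reuses the values of a and e quoted earlier for Jupiter, and the variable names are my own.

```python
import math

a = 5.203  # semi-major axis of Jupiter in AU (value quoted in the essay)
e = 0.048  # eccentricity of Jupiter (value quoted in the essay)

f = a * e             # focus-to-centre distance
l = a * (1 - e**2)    # semi-latus rectum, equal to b^2 / a

# With the second focus on the positive x-axis, theta = 0 points at the
# far end of the ellipse and theta = pi at the near end.
r_far = l / (1 - e * math.cos(0.0))       # apoapsis
r_near = l / (1 - e * math.cos(math.pi))  # periapsis

print(f"apoapsis = {r_far:.3f} AU, periapsis = {r_near:.3f} AU")
```

The two distances agree with a + f and a - f, and they differ by 2f, the distance between the foci.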

After having looked at ellipses and their relation to Kepler, I would now like to look at a mathematical derivation of Kepler’s First Law. I will be basing my derivation on a paper that worked through it with very little detail, and adding in more detail in order to make it accessible to other students.

3. Kepler’s First Law

The total energy for a body in orbit can be given by

$$E = \frac{1}{2}mv^2 - \frac{GMm}{r}$$

where m is the mass of the body in orbit, M is the mass of the star being orbited, v is the speed of the orbiting body, r is its distance from the star, and G is the universal gravitational constant.

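Looking at Figure 9[9], which depicts this system, we can break the velocity into two components, one along the line joining the body to the star and one perpendicular to it:

$$v_r = \frac{dr}{dt} \quad \text{and} \quad v_\theta = r\omega$$

(where ω is the angular velocity)

These two components are orthogonal (meaning that they are at right angles to one another) therefore Pythagoras’ Theorem can be used: the square of the total velocity is equal to the sum of the squares of the components.

$$v^2 = \left(\frac{dr}{dt}\right)^2 + r^2\omega^2$$

Now if we look at the equation for angular momentum, L, we know that

$$L = mr^2\omega$$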

We shall now create the substitution that , so that

Remembering that , we can integrate with respect to to get an expression for :

Using related rates of change we can simplify this, as

and implies that , therefore

So substituting in this value for (or ):

This gives us the equation in which we will eventually substitute in an equation for .

We shall now put this aside for now and return to our equation for the total energy, which rearranges as follows:

It is now convenient to substitute in the equation previously established in order to get rid of the , as this has no place in the final equation for an ellipse.

Since , and cancel out, we can therefore say:

We now make two new substitutions, as this equation has many variables that are not relevant in the equation of an ellipse in polar co-ordinates, and lacks two vital parameters: and . Therefore we need to use equations connecting some of the current variables to these constants, which should eventually rearrange to produce the equation for an ellipse.

The first equation we can use is

and the second equation is

both of which can be derived using the Conservation of Energy, although I will not do so, as the proofs of these equations are not directly relevant. I had initially hoped to be able to explain them by suggesting how they are intuitively correct, but I could not find any intuitive reason as to why they are true, therefore I will not include derivations for either equation but have included links to explanations in the footnotes.

Rearranging the equation for into an equation for with more relevant parameters for an ellipse:

We can take out the and square root the whole equation, as will cancel out when we eventually plug the equation for into .

We can use the substitution here that ,

Rearranging to make the subject, we can substitute again to get

Ideally we now want to make into one fraction, as this will make substitution easy, therefore we can multiply the top and the bottom of the first fraction by :

And rewrite this as:

And conveniently our other substitution, , can easily be used here:

It is important here that we have written in the form , as now when we substitute into , the L and m will cancel out and we can use the general rule that as follows:

Finally this can be rearranged to form the polar equation of an ellipse, with the origin (defined to be the Sun) at one of its foci, which is Kepler’s First Law.
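The conclusion can also be tested numerically, without using any of the algebra above. The sketch below is an illustration only, not part of the derivation: it steps a body forward under an inverse-square attraction towards the origin using the velocity-Verlet method (a standard numerical scheme that I am bringing in, not one from the sources), in made-up units with GM = 1 and arbitrary starting values. It then checks the focal property from Section 2: if the path is an ellipse with the Sun at one focus, the distances to the Sun and to the second focus should always add up to 2a.

```python
import math

def simulate_orbit(x, y, vx, vy, gm=1.0, dt=1e-3, steps=20000):
    """Velocity-Verlet integration under an inverse-square force.

    The attracting mass (the Sun) sits at the origin. Units are arbitrary
    (GM = 1 by default). Returns the list of (x, y) positions visited.
    """
    def accel(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -gm * px / r3, -gm * py / r3

    ax, ay = accel(x, y)
    path = [(x, y)]
    for _ in range(steps):
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = accel(x, y)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
        path.append((x, y))
    return path

# Hypothetical starting values: closest approach on the positive x-axis,
# moving at right angles to the line joining the body to the Sun.
path = simulate_orbit(1.0, 0.0, 0.0, 1.2)

radii = [math.hypot(px, py) for px, py in path]
r_min, r_max = min(radii), max(radii)
a = (r_min + r_max) / 2  # semi-major axis, from periapsis + apoapsis = 2a
f = (r_max - r_min) / 2  # focus-to-centre distance, from their difference = 2f

# The Sun is at the origin, so the second focus should be at (-2f, 0).
sums = [math.hypot(px, py) + math.hypot(px + 2 * f, py) for px, py in path]
spread = max(sums) - min(sums)
print(f"a = {a:.4f}, f = {f:.4f}, spread of PF1 + PF2 = {spread:.2e}")
```

A spread that is tiny compared with 2a suggests that the simulated path is indeed an ellipse with the Sun at one focus, to within the accuracy of the time step.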

4. Kepler’s Second Law

Due to the nature of an ellipse, at some points in the orbit the planet will experience a stronger gravitational force than at others, as gravitational force is a function of the distance between the two masses. This causes the planet to travel faster at some points (when it is close to the Sun) and slower at others (when it is further away from the Sun). However Kepler’s Second Law is very satisfying in my opinion, as it states that an imaginary line segment joining a planet and the Sun always sweeps out equal areas during equal intervals of time. This can be demonstrated best on a diagram, as shown in Figure 10[10].

This seems logical when considered: when the planet is furthest from the Sun, it will have a smaller velocity, and therefore cover a smaller distance in a given time period. However the distance from the Sun will be greater, so the other dimension of the triangle will be larger. Equally, when close to the Sun, the distance from the Sun is obviously smaller but the velocity will be greater, so the displacement about the Sun will be greater in the same given time period. Therefore it is easy to believe that Kepler’s Second Law should be approximately true once told, but slightly more difficult to prove. I shall try to do so below, using infinitesimals and calculus.
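Before the proof, the intuition can be checked numerically. The sketch below is only an illustration under assumptions of my own: it steps a body forward under an inverse-square attraction towards the origin using the velocity-Verlet method, in made-up units with GM = 1 and arbitrary starting values, and it adds up the thin triangles swept out by the line from the Sun to the body in several equal time windows.

```python
def swept_areas(x, y, vx, vy, gm=1.0, dt=1e-3, windows=8, steps_per_window=1500):
    """Return the area swept out by the Sun-planet line in equal time windows.

    The Sun is at the origin. Each small step contributes the area of the
    thin triangle with corners at the Sun, the old position and the new one.
    """
    def accel(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -gm * px / r3, -gm * py / r3

    ax, ay = accel(x, y)
    areas = []
    for _window in range(windows):
        area = 0.0
        for _step in range(steps_per_window):
            x_new = x + vx * dt + 0.5 * ax * dt * dt
            y_new = y + vy * dt + 0.5 * ay * dt * dt
            area += 0.5 * abs(x * y_new - y * x_new)  # triangle area
            ax_new, ay_new = accel(x_new, y_new)
            vx += 0.5 * (ax + ax_new) * dt
            vy += 0.5 * (ay + ay_new) * dt
            x, y, ax, ay = x_new, y_new, ax_new, ay_new
        areas.append(area)
    return areas

# Hypothetical starting values: closest approach, moving at right angles
# to the line joining the body to the Sun.
areas = swept_areas(1.0, 0.0, 0.0, 1.2)
print(areas)
```

Although the body speeds up and slows down around the orbit, the eight areas should agree with one another to many decimal places.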

In Figure 11[11], we see the two successive positions (A and B) of a planet in orbit in a time $dt$, with a changing distance to the Sun, which we shall call $r$. In this time it has moved through displacement $ds$, and therefore travelled through an angle $d\theta$.

A line segment from B can be drawn to a point C such that BC is perpendicular to SB, in order that AC is the change in radius, $dr$.

Using trigonometry, we can state that $BC = SB\tan(d\theta)$. However as ds becomes infinitely small, BC becomes perpendicular to SC as well as to SB, meaning we can approximate the distance BC to $r\,d\theta$, as for any very small angle $\theta$, both $\sin\theta$ and $\tan\theta$ tend to $\theta$.