Classical and Celestial Mechanics

Classical mechanical systems, including Newtonian gravitational systems (celestial mechanics), are the primary historical source of problems in dynamical systems. The development of the Hamiltonian and Lagrangian methods, described below, expanded this area of study into new practical and mathematical realms, such as quantum mechanics in the 20th century.

Lagrangian systems

The Lagrangian method in mechanics historically comes from the calculus of variations, which is the study of functions that are extremal or at least stationary to perturbations with respect to some criteria. The standard set up for this is that we have given endpoints a and b in a space (which we will take to be \mathcal{R}^{n}, although we could be working in local coordinates on a more general manifold), and we wish to find critical points of the action functional, defined through the Lagrangian \mathcal{L}(q,\dot{q}) which is a function of the position variables q = \{q_1, \ldots q_n \} and tangent space variables \dot{q} = \{ \dot{q}_1, \ldots , \dot{q_n} \}. In the functional, we integrate along a path q(t)

\int_a^{b} \mathcal{L}(q,\dot{q}) dt .

If we consider all differentiable paths going between two fixed points from time a to time b, the paths which are critical points of the action functional must satisfy the Euler-Lagrange equations

\frac{\partial \mathcal{L}}{\partial q_i} - \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot{q}_i} = 0 .

For simple mechanical systems the Lagrangian is usually the kinetic energy T minus the potential energy V: \mathcal{L} = T - V, although the general theory of these systems does not require Lagrangians to always be of that form.

Lagrangian example: The Double Pendulum

As an example of the Lagrangian approach we will consider the planar double pendulum, in which we have two masses m_1 and m_2 connected by massless rigid rods of lengths l_1 and l_2 acted upon by a constant gravitational acceleration of -g. The first rod is attached to an unmoving pivot point. In contrast to the single pendulum model, which has relatively simple periodic orbits, the dynamics are much more complicated.

There are many variations of the double pendulum, such as the double compound pendulum which has two distributed (rather than point) masses; our choice is for simplicity rather than realism, since the qualitative features of the dynamics are similar for all double pendulum models.

We will use angular coordinates, \theta_1 and \theta_2 measured from the vertical, for the respective two masses. This makes the kinetic energy expression a little complicated:

T = \frac{m_1 + m_2}{2} l^2_1 \dot{\theta}^2_1 + \frac{m_2}{2} l^2_2 \dot{\theta}^2_2 + m_2 l_1 l_2 \dot{\theta}_1\dot{\theta}_2 \cos\left(\theta_1 - \theta_2 \right)

but the potential is relatively simple:

V = -\left(m_1 + m_2\right) l_1 g \cos\left(\theta_1\right) - m_2 l_2 g \cos\left(\theta_2\right).

The Lagrangian is \mathcal{L} = T - V. We can simplify things a little by using dimensionless parameters \mu = m_2/m_1, \tau = t \sqrt{\frac{g}{l_1}}, and \lambda = \frac{l_1}{l_2}.

The first Euler-Lagrange equation is

(1 + \mu) \theta_{1}'' + \mu \lambda \left( \theta_{2}'' \cos \left(\theta_2 - \theta_1 \right) - \left( \theta_{2}' \right)^2 \sin \left(\theta_2 - \theta_1 \right) \right) + (1 + \mu) \sin \left(\theta_1\right) = 0

and the second is

\mu \lambda^2 \theta_{2}'' + \mu \lambda \left( \theta_{1}'' \cos \left(\theta_2 - \theta_1 \right) + \left( \theta_{1}' \right)^2 \sin \left(\theta_2 - \theta_1 \right) \right) + \mu \lambda \sin \left(\theta_2\right) = 0.

These equations unfortunately contain linear combinations of the second derivatives. For both theoretical and numerical purposes it is helpful to isolate the individual second derivatives and then convert to a first-order system of four equations introducing the velocities w_1 = \theta_1' and w_2 = \theta_2 ', and

w_1' = -\frac{c_{12} \mu s_{12} w_{1}^{2} + l \mu s_{12} w_{2}^{2} + c_{12} \mu s_{2} - {\left(\mu + 1\right)} s_{1}}{s_{12}^2 \mu - 1}
w_2' = \frac{c_{12} l \mu s_{12} w_{2}^{2} - {\left(c_{12} \mu + c_{12}\right)} s_{1} + {\left(\mu w_{1}^{2} + w_{1}^{2}\right)} s_{12} + \mu s_{2} + s_{2}}{l ( s_{12}^2 \mu - 1)}

where we have abbreviated s_1 = \sin(\theta), c_{12} = \cos(\theta_2 - \theta_1), s_{12} = \sin(\theta_2 - \theta_1).

A particular case with \mu=2 and l=0.8 is shown below.

Hamiltonian systems

A Hamiltonian system is an even-dimensional system that can be written in terms of two vector quantities p and q, with equations of motion

\dot{q} = \frac{\partial H}{\partial p}
\dot{p} = -\frac{\partial H}{\partial q}

If we change coordinates (possibly in a time-dependent way) to (Q,P,t) = \Psi(q,p,t) such that the form of Hamilton's equations are preserved (i.e. \dot{Q} = \frac{\partial \hat{H}}{\partial P} and \dot{P} = -\frac{\partial \hat{H}}{\partial Q} where \hat{H} is the Hamiltonian in the new coordinates), then we say the transformation \Psi is a canonical transformation. The Jacobian matrix M = D \Psi of a canonical transformation has some special properties, and we call such matrices symplectic matrices. A matrix M if defined to be symplectic if it satisfies the identity

M J M^T = J

where J is the skew matrix

J = \left ( \begin{array}{cc} 0 & I \\ -I & 0 \end{array} \right )

in which I is the n by n identity matrix.

Celestial Mechanics

Isaac Newton portrait
Isaac Newton

The study of celestial mechanics has produced an enormous amount of mathematics, including the invention of calculus and differential equations by Isaac Newton in his efforts to understand the motion of the moon. The convergence of power series was studied by Augustin-Louis Cauchy partly in order to understand approximations to Kepler's equation in the two-body problem. Perhaps most notably for this text, Henri Poincare invented a number of tools and concepts in dynamical systems and topology while considering stability in the solar system and related N-body problems.

Newtonian gravitation model

While secluding himself at his parent's country farm to escape the plague in 1666, Newton began formulating a new mathematical approach to understand the motion of the moon and other gravitational interactions. Eventually in 1687 he published his great work Philosophiae Naturalis Principia Mathematica.

Newton's model for gravitation is that any massive body is acted upon by others by a force proportional to the inverse square of the distance between them, and proportional to the product of their masses:

m_i \ddot{q_i} = \sum_{j \neq i} F_{i,j} = \sum_{j \neq i} \frac{G m_i m_j (q_j - q_i)}{|q_j - q_i|^3}

In standard units the constant of proportionality G \approx 6.6743 \cdot 10^{-11} \frac{N m^2}{kg^2}, first measured accurately by Henry Cavendish.

It is common to denote the distances |q_j - q_i| by r_{i,j}.

An important realization of Newton (and perhaps the first use of multivariable integration) was that outside a spherically symmetric body its combined gravitational influence is identical to a point mass. Since most massive celestial bodies such as our moon or the Earth are approximately spherical, it is often an excellent approximation to treat them as point masses (unless we want to model close interactions, such as the long term behavior of a near-Earth satellite).

The forces in the Newtonian model can be considered to come from a potential function

U = \sum_{i<j} \frac{m_i m_j}{r_{i,j}}

In some literature this function is given a negative sign, since we usually want to consider the potential energy to be V = - U. With our convention, m_i \ddot{q_i} = \nabla_i U. Then the total energy is

h = K + V = \frac{1}{2} \sum_i m_i |q_i|^2 - \sum_{i< j} \frac{m_i m_j}{r_{i,j}}

which is a conserved quantity.

The Two-Body Problem

If there are only two bodies interacting we can more or less completely solve the problem. The first key idea is to reduce the equations to a single body by noting that the center of mass c = \frac{m_1 q_1 + m_2 q_2}{m_1 +m_2} satisfies the equation \ddot{c} = 0 since F_{i,j} = -F_{j,i}. We can choose a coordinate system in which c is stationary at the origin, so m_1 q_1 + m_2 q_2 = 0. We can then eliminate one of the bodies' coordinates, or equivalently consider the difference vector q = q_1 - q_2. Its equation of motion is simply

\ddot{q} = - \mu \frac{q}{|q|^3}

where \mu = \frac{m_1 m_2}{m_1 + m_2} is the "reduced mass".

Johannes Kepler portrait
Kepler in 1610

The classical results of Newton and Kepler are that the trajectories of these equations lie on conics; in the bounded case they are on ellipses, with the center of mass at the focus, and so in polar coordinates

r(\theta) = \frac{a (1 - e^2)}{1 - e \cos{\theta - \theta_0}}

where the eccentricity is e = \sqrt{1 - \frac{b^2}{a^2}} and \theta, the true anomaly, is measured from the semi-major axis.

There are many treatments of this problem which use a variety of methods. Many expositions start by finding three conserved quantities: the energy

h = \frac{|v|^2}{2} - \frac{\mu }{|q|}

the angular momentum in the plane of motion

L = q_1 \dot{q_2} - \dot{q_1} q_2

and the less intuitive Hermann-Bernoulli-Laplace vector or as it is usually incorrectly called, the Laplace-Runge-Lenz vector

A = (v \cdot v) q - (q \cdot v) v - \frac{\mu q}{r}

With these conserved quantities it is possible to derive Kepler's laws, which for bounded orbits say that the trajectories are elliptic, with periods proportional to the cube of the semi-major axis length, and that equal areas are swept out in equal times (measured from the center of mass).

Here we use a very clever and insightful method developed by Jean-Marie Souriau , one of the founders of symplectic geometry in the 1970s. We follow a summary of this approach given by Richard Moeckel in his course notes for Math 8520 at the University of Minnesota. For the moment we will only consider the bounded solutions for which the energy is negative.

We first change coordinates using a modified time variable u, for which \frac{d u}{d t} = \frac{1}{r}, so \frac{d t}{d u} = r, where r = |q|. Using a prime to denote differentiation with respect to u, we have the system

q' = r v

and

v' = - \frac{\mu q}{r^2}.

It will be useful to also compute that r' = q \cdot v and q'' = (q \cdot v) v - \frac{\mu q}{r}. The energy

h = \frac{1}{2} |v|^2 - \frac{\mu}{r}

is still constant with respect to u.

We now consider the quantity Z = (t, q) where we regard t and q as functions of u. The first four derivatives of Z with respect to u are

Z' = (r, rv)
Z'' = (q \cdot v, (q \cdot v) v - \frac{\mu q}{r})
Z''' = (2 h r + \mu, 2 h r v) = 2 h Z′ + (\mu, 0)
Z'''' = 2 h Z''

This last equation is remarkably simple. Let us call Z'' = X and Z''' = Y, and then

X' = Y
Y' = 2 h X

so X is a harmonic oscillator with frequency w = \sqrt{-2 h} (recall we are assuming that h<0).

What a strange and pleasant surprise, to find a simple linear harmonic oscillator hiding within the very nonlinear Kepler problem. We now know that X and Y are vectors of sinusoidal functions

X = C_1 \cos(w t) + C_2 \sin(w t)
Y = - w C_1 \sin(w t) + w C_2 \cos(w t)

where C_1 and C_2 are constant vectors.

Because of the hybrid structure of X and Y it is convenient to break them back up into pieces X = (X_0, \hat{X}) and Y = (Y_0, \hat{Y}). Then we have relations

X_0 = q \cdot v, \ \ \ \ \hat{X} = (q \cdot v) v - \mu \frac{q}{r}
Y_0 = 2 h r + \mu , \ \ \ \ \hat{Y} = 2 h r v
h = \frac{1}{2} |v|^2 - \frac{\mu}{r}

These can be inverted to obtain

r = \frac{Y_0 - \mu}{2 h}, \ \ \ \ q = \frac{X_0 \hat{Y} - (Y_0-m)\hat{X}}{2 h \mu }, \ \ \ \ v = \frac{\hat{Y}}{Y_0 - \mu}

We can choose u(0) to be a point where q\cdot v = 0, in which case C_1 = (0, - \mu \frac{q_0}{r_0}) and w C_2 = (2 h r_0 + \mu, 2 h r_0 v_0). Using these we can simplify the r,q,v in terms of the initial conditions to get

r = \frac{\mu}{w^2} \left (1 - \frac{\mu - r_0 w^2}{\mu} \cos(w u) \right ) = a(1 - e \cos(E)

and

q = - a (e + \cos (E) ) \frac{q_0}{r_0} + b \sin(E) \frac{v_0}{|v_0|}

where we have used the more traditional variables for the ellipse with semi-axes a = \mu/w^2, b = a \sqrt{1 - a^2} and the eccentric anomaly E = w u.

Finally we can integrate t' = r to get Kepler's equation between the mean anomaly M = \frac{\mu}{w} t and E:

M = E - e \sin(E).

Kepler's Equation and a Musical Connection (FM Synthesis)

In its traditional form, Kepler's equation is usually written as

M = E - \epsilon \sin (E)

in which M is the so-called mean anomaly, a kind of averaged phase which is linear in time, and E is the eccentric anomaly, from which the angle \theta of the body relative to a focal point of an ellipse can be obtained. We would like to invert this equation to get E as an explicit function of time but unfortunately that seems impossible in closed form.

This equation is related to the modulated oscillators used in Frequency Modulated (FM) synthesis of musical sounds:

f = \frac{E - M}{\epsilon} , \ \ \ \ E = M + \epsilon f
f = \sin(E) = \sin (M + \epsilon f ) = \sin (w t + \epsilon f)

Using the Bessel functions

J_n(z) = \frac{1}{\pi} \int_0^{\pi} \cos( n \theta - z \sin(\theta)) \ d \theta

We can expand

\sin(w_1 t + \epsilon \sin ( w_2 t) ) = \sum_{n = - \infty}^{n = \infty} J_n(\epsilon) \sin( w t + n w_2 t)

and then

f = \sin(E) = \sum_{n=1}^{\infty} \frac{2 J_n(n \epsilon)}{n \epsilon} \sin (n w t)

Some graphs of Bessel functions are shown below:

Bessel functions
The first few Bessel functions

The Restricted Three-Body Problem

Because the celestial bodies of the solar system are so large, there are many situations of practical and theoretical interest in which we consider a body of mass so small in comparison that it is effectively zero. Examples of this include human-made satellites and spacecraft, comets, asteroids, and small moons.

If there are two massive bodies and one infinitesimal body, we are in the so-called restricted three-body problem.

Sitnikov's Problem

The Sitnikov problem is about a very special case of the restricted three-body problem, which despite its simplicity contains a horseshoe-like tangle having a conjugacy to the shift map. This is beautifully explained in Jurgen Moser's excellent book, Stable and Random Motions in Dynamical Systems, from which much of our exposition is derived.

z'' = \frac{-z}{(r^2 + z^2)^{3/2}}

where r=1 for the circular case, or r = 1 + \epsilon \cos(t) in the mildly elliptic case.

Central configurations

Already for the three-body problem there is not a really useful form of solution available, although we have many qualitative results. An important particular class that admits an explicit solution for the n-body problem is provided by the central configurations. Planar central configurations are the most important, as they are relative equilibria - equilibria in a uniformly rotating coordinate system.

Central configurations can be defined as those n-body configurations \{q_1, \ldots, q_n\} satisfying

(q_i - c) - \sum_{j = 1, j \neq i}^n \frac{m_j (q_i - q_j)}{r_{i,j}^3} = 0

where c is the center of mass.

Chaos in the Three-Body Problem

The three-body problem was long suspected of having chaotic dynamics, although this was not formally proved in a generic setting until Richard Moeckel's 1988 paper Chaotic Dynamics Near Triple Collision.