Differential Equations and Integrals

Numerical differentiation: forward and centered differences

A fundamental problem in numerical analysis is to estimate the derivative of a function. The usual definition of derivative for a function f:

\frac{d f}{d x} = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h}

naturally leads to a method, the forward difference method, for a fixed h:

\frac{d f}{d x} \approx \frac{f(x+h) - f(x)}{h}

Unfortunately this method is not very good and should be avoided where possible. The derivative is an intrinsically difficult quantity to approximate because the numerator involves the subtraction of nearly identical numbers. This results in a loss of precision that is to some extent unavoidable, but we can at least do better than the forward difference.

We can derive better methods by using a power series expansion of the function f at x:

f(x + h) = f(x) + f'(x) h + \frac{1}{2} f''(x) h^2 + \ldots

(We are assuming the function f has a power series expansion about x which converges to f at least on the interval from x to x+h.)

We will derive the centered difference approximation by optimizing a linear combination of function values in the numerator:

\frac{d f}{d x} \approx \frac{A f(x-h) + B f(x) + C f(x + h)}{h}

We replace f(x + h) and f(x - h) by their quadratic approximations

f(x - h) \approx f(x) - f'(x) h + \frac{1}{2} f''(x) h^2
f(x + h) \approx f(x) + f'(x) h + \frac{1}{2} f''(x) h^2

Now the numerator A f(x-h) + B f(x) + C f(x + h) is approximately

(A + B + C) f(x) + (-A + C) f'(x)h + (A + C)f''(x)h^2/2

which we would like to equal f'(x)h, so we need

A + B + C = 0
-A + C = 1
A + C = 0

which can be solved to get A = -1/2, B = 0, and C = 1/2. (It is somewhat surprising that the best use of the f(x) value is to throw it out.) This yields the centered difference approximation:

\frac{d f}{d x} \approx \frac{f(x+h) - f(x -h)}{2 h}

which has an error of order h^2 compared to the forward difference approximation's error of order h.

The loss of precision from cancellations in the numerator means the error in floating-point computations will actually increase at some point as we decrease h.
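As a concrete illustration, here is a minimal sketch in Python (the choice of test function, point, and step sizes is ours) that tabulates both errors for the derivative of e^x at x = 0; its output reproduces the behavior shown in the plots below.

    import numpy as np

    def forward_diff(f, x, h):
        # first-order approximation (f(x+h) - f(x)) / h
        return (f(x + h) - f(x)) / h

    def centered_diff(f, x, h):
        # second-order approximation (f(x+h) - f(x-h)) / (2h)
        return (f(x + h) - f(x - h)) / (2 * h)

    exact = 1.0  # derivative of exp(x) at x = 0
    for h in 10.0 ** -np.arange(1, 13):
        err_fwd = abs(forward_diff(np.exp, 0.0, h) - exact)
        err_ctr = abs(centered_diff(np.exp, 0.0, h) - exact)
        print(f"h = {h:.0e}   forward error = {err_fwd:.1e}   centered error = {err_ctr:.1e}")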

Error in forward difference for exp(x)
Error in forward difference (log-log) for the derivative of e^x at x=0.

Using the centered difference approximation decreases the error, but the precision loss begins to dominate at a larger value of h. In the plot below, the error of the centered difference approximation is shown in red.

Error in forward and centered difference methods for exp(x)
Error in centered (red) and forward (blue) differences (log-log).

Considering the power series of the function not only lets us construct optimal estimates for the derivatives, it also provides a way to estimate when the rounding error will start to dominate the calculation. For the centered difference formula, the first neglected term contributes a truncation error of \frac{h^2 f'''(c)}{6} for some value of c \in [x-h,x+h]. The rounding error will be approximately \frac{\epsilon}{h}, where \epsilon is the machine epsilon of whatever floating-point format we are using. The overall error will then be approximately

E(h) \approx \frac{\epsilon}{h} + \frac{h^2 f'''(c)}{6}

This will have a minimum when E'(h) = 0. If we can bound the third derivative by |f'''(c)| < M, then the error-minimizing value of h is approximately h \approx (3 \epsilon/M)^{1/3}.
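For example, for f(x) = e^x near x = 0 we can take M \approx 1, and in double precision \epsilon \approx 2.2 \times 10^{-16}, so this estimate gives

h \approx (3 \epsilon)^{1/3} \approx 8.7 \times 10^{-6}, \ \ \ E(h) \approx 4 \times 10^{-11}

so no choice of h gives the centered difference much more than about ten correct digits here.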

Higher-order methods are more accurate and reach their minimum error at larger values of h than lower-order methods.

Initial value problems and Euler's Method

Now we consider the related problem of numerically approximating the solution to a differential equation (or system of differential equations). For now we will focus on initial value problems (IVPs) in the standard form:

y' = f(x,y), \ y(x_0) = y_0

The simplest approximation method is usually called Euler's Method, which relies on a tangent-line approximation to the solution at each approximation point (x_i, y_i) to obtain the next point (x_{i+1}, y_{i+1}):

x_{i+1} = x_i + h
y_{i+1} = y_i + h f(x_i, y_i)

The x update simply expresses our use of a stepsize of h, and will be common to all of the methods we consider. In many systems of differential equations the magnitude of the slope field strongly varies along some solution trajectories, and it makes sense to change the stepsize adaptively. For now we will assume the stepsize h is fixed.
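A minimal sketch of Euler's method in Python (the function names and the test problem y' = y, y(0) = 1 are illustrative choices, not anything fixed by this text):

    def euler(f, x0, y0, h, n):
        """Take n Euler steps of size h for the IVP y' = f(x, y), y(x0) = y0."""
        x, y = x0, y0
        for _ in range(n):
            y = y + h * f(x, y)   # tangent-line step
            x = x + h
        return x, y

    # Test problem y' = y, y(0) = 1, whose exact solution at x = 1 is e.
    x, y = euler(lambda x, y: y, 0.0, 1.0, 0.1, 10)
    print(y)   # ~ 2.5937, noticeably below e ~ 2.7183, as expected for a first-order method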

Higher-order methods

At every step Euler's method will be in error by an amount that is usually dominated by the magnitude of the derivative of its slope function along the solution - the second derivative of the solution y(x). At each step the error bound is proportional to h^2; over a fixed solution interval [x_0, x_f] after taking n steps of stepsize h the total error bound is:

|y(x_{f})-y_n| < M h

for some constant M that depends on f(x,y). So we expect the error to decrease linearly with h, once h becomes small.

Improved Euler's Method

The most commonly used second order method has many names, including the Improved Euler's method, Heun's method, and the Explicit Trapezoid method. It uses the average of two slopes; the first is the slope used in Euler's method, k_1 = f_i = f(x_i, y_i), and the second is the slope at the point we would have used in Euler's method for y_{i+1}:

k_1 = f_i = f(x_i, y_i)
k_2 = f(x_i + h, y_i + h k_1)
y_{i+1} = y_i + h \frac{k_1 + k_2}{2}
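The same test problem gives a feel for the improvement; a minimal Python sketch (again with our own naming):

    def improved_euler(f, x0, y0, h, n):
        """Improved Euler (Heun / explicit trapezoid) steps for y' = f(x, y)."""
        x, y = x0, y0
        for _ in range(n):
            k1 = f(x, y)
            k2 = f(x + h, y + h * k1)
            y = y + h * (k1 + k2) / 2
            x = x + h
        return x, y

    x, y = improved_euler(lambda x, y: y, 0.0, 1.0, 0.1, 10)
    print(y)   # ~ 2.714, much closer to e ~ 2.7183 than Euler's 2.5937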

4th Order Runge-Kutta

In the late 19th and early 20th centuries, Carl Runge and Martin Kutta developed a family of efficient one-step methods, now called Runge-Kutta methods. The fourth-order method provides a nice balance between complexity and order of convergence; it is good enough for most applications.

Carl David Tolmé Runge.

The 4th-order Runge-Kutta formulae for the ODE

\frac{dy}{dx} = f(x,y), going from (x_i, y_i) to (x_{i+1}, y_{i+1}) with stepsize h are:

k_1 = f(x_i,y_i)
k_2 = f(x_i + h/2, y_i + h k_1/2)
k_3 = f(x_i + h/2, y_i + h k_2 /2)
k_4 = f(x_i + h, y_i + h k_3)

And then:

y_{i+1} = y_i + h (k_1 + 2 k_2 + 2 k_3 + k_4)/6
x_{i+1} = x_{i} + h
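A minimal sketch of one RK4 step in Python (the naming is ours), again applied to the test problem y' = y:

    def rk4_step(f, x, y, h):
        """One step of the classical 4th-order Runge-Kutta method."""
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        return x + h, y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

    x, y = 0.0, 1.0
    for _ in range(10):                       # ten steps of size h = 0.1
        x, y = rk4_step(lambda x, y: y, x, y, 0.1)
    print(y)   # ~ 2.718280, within about 2e-6 of the exact value e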

Multistep methods

Adams-Bashforth two-step:

The Adams-Bashforth methods are explicit multistep methods which use interpolation to estimate an integral of slopes. First we recast the first-order ODE:

y' = f(x,y)

as an integral:

y_{i+1} = y_{i} + \int_{x_i}^{x_{i+1}} f(x, y(x)) dx

For the two-step method, the integrand f(x,y(x)) is approximated with an interpolating line based on the last two step values f_i = f(x_i, y_i) and f_{i-1} = f(x_{i-1}, y_{i-1}). This interpolating line is

p(x) = f_i + \frac{f_i - f_{i-1}}{x_i - x_{i-1}}(x - x_i)

If the approximation points are evenly spaced, with x_{i+1} = x_i + h, then the integral simplifies considerably and we obtain the update rule:

y_{i+1} = y_{i} + h(3 f_{i}- f_{i-1})/2
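Because a multistep method needs values from earlier steps, it has to be started with a one-step method. Below is a minimal Python sketch that uses one explicit trapezoid step to start (an arbitrary but common choice), again on the test problem y' = y:

    def adams_bashforth2(f, x0, y0, h, n):
        """Two-step Adams-Bashforth for y' = f(x, y); takes n >= 1 steps."""
        x, y = x0, y0
        f_prev = f(x, y)                       # f_0
        # one explicit trapezoid step to generate the starting value y_1
        y = y + h * (f_prev + f(x + h, y + h * f_prev)) / 2
        x = x + h
        for _ in range(n - 1):
            f_curr = f(x, y)
            y, f_prev = y + h * (3 * f_curr - f_prev) / 2, f_curr
            x = x + h
        return x, y

    x, y = adams_bashforth2(lambda x, y: y, 0.0, 1.0, 0.1, 10)
    print(y)   # ~ 2.708, versus the exact value e ~ 2.7183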

Implicit and Predictor-Corrector Methods

Implicit methods gain accuracy and stability by requiring consistency between the new solution value and the slope evaluated there, at the added cost of having to solve an equation at each step. The simplest implicit method is the implicit (backward) Euler method, where, writing f_{i+1} = f(x_{i+1}, y_{i+1}), we require:

y_{i+1} = y_i + h f_{i+1}

Similarly, the improved Euler method (also known as the explicit trapezoid method) has an implicit version:

y_{i+1} = y_i + h (f_{i+1} + f_i)/2

These methods have global truncation errors of order one and two, respectively, just like their explicit counterparts, but the error is usually lower in magnitude for a given step size.
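For a general slope function these equations must be solved for y_{i+1} at every step, typically with Newton's method or with a few fixed-point iterations started from an explicit "predictor" step; iterating an implicit "corrector" formula this way is the basic idea of predictor-corrector methods. A minimal Python sketch for the implicit Euler method (the test problem and the iteration count are arbitrary choices, and plain fixed-point iteration only converges when h |\partial f/\partial y| < 1, so genuinely stiff problems call for Newton's method instead):

    def implicit_euler_step(f, x, y, h, iterations=10):
        """One backward Euler step: solve y_new = y + h*f(x + h, y_new)
        by fixed-point iteration, seeded with an explicit Euler predictor."""
        y_new = y + h * f(x, y)                 # predictor
        for _ in range(iterations):
            y_new = y + h * f(x + h, y_new)     # corrector iterations
        return y_new

    # Test problem y' = -2y, y(0) = 1, exact solution e^{-2x}.
    x, y, h = 0.0, 1.0, 0.1
    for _ in range(10):
        y = implicit_euler_step(lambda x, y: -2 * y, x, y, h)
        x = x + h
    print(y)   # ~ 0.1615; the exact value is e^{-2} ~ 0.1353, and explicit Euler gives ~ 0.1074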

The two-step implicit (Adams-Moulton) method, which is third-order accurate, is:

y_{i+2} = y_{i+1} + \frac{h}{12}(5 f_{i+2} + 8 f_{i+1} - f_i).

The three-step, fourth-order implicit (Adams-Moulton) method is more commonly used:

y_{i+3} = y_{i+2} + \frac{h}{24}(9 f_{i+3} + 19 f_{i+2} - 5 f_{i+1} + f_i) .

Such methods can be derived for any order, but in practice people mostly use methods of order 4 to 12.

Störmer-Verlet methods

Often in applications (particularly physical applications) the differential equations are second-order, and when converted to a first-order system the variables are positions and velocities. The Störmer-Verlet methods take an Euler half-step to advance the velocity v, update the position x using the new velocity, and then take a second half-step to finish updating v.
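A minimal sketch of this "kick-drift-kick" update in Python (the names and the harmonic oscillator test problem x'' = -x are our own illustrative choices):

    def verlet(accel, x0, v0, h, n):
        """Stormer-Verlet integration of x'' = accel(x), starting from (x0, v0)."""
        x, v = x0, v0
        for _ in range(n):
            v = v + 0.5 * h * accel(x)   # half-step update of the velocity
            x = x + h * v                # full step of the position
            v = v + 0.5 * h * accel(x)   # second half-step of the velocity
        return x, v

    # Harmonic oscillator x'' = -x, x(0) = 1, v(0) = 0; exact solution (cos t, -sin t).
    x, v = verlet(lambda x: -x, 1.0, 0.0, 0.01, 628)   # integrate to t ~ 2*pi
    print(x, v, (x**2 + v**2) / 2)   # x ~ 1, v ~ 0, and the energy stays ~ 0.5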

Parasitic solutions and instability:

In designing ODE methods we must avoid intrinsic instabilities. For example, the two-step method

y_{i+2} = 2y_i - y_{i+1} + h(f_i + 5 f_{i+1})/2

will tend to amplify any errors in the computed solution until they swamp the true solution (exercise).

Quadrature and numerical integration

There is a rich literature of methods used to numerically evaluate a definite integral

\int_a^b f(x) dx

We will not cover these methods in detail in this text, because many of them can be obtained by simply applying methods for differential equations to the corresponding initial value problem

y' = f(x), \ \ \ y(a) = 0

whose solution at x = b is the desired integral.

Trapezoid and Simpson's Rules

The improved Euler method reduces to the composite trapezoid rule for integration, with h = (b-a)/n:

\int_a^b f(x) dx \approx h \left ( f(a)/2 + \sum_{k=1}^{n-1} f(a + kh) + f(b)/2 \right )

which is why it is sometimes called the Explicit Trapezoid Method. Similarly, the fourth-order Runge-Kutta method reduces to Simpson's rule when used for integration.
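A minimal Python sketch of the composite trapezoid rule (the test integral is chosen purely for illustration):

    import math

    def trapezoid(f, a, b, n):
        """Composite trapezoid rule with n equal subintervals."""
        h = (b - a) / n
        total = (f(a) + f(b)) / 2
        for k in range(1, n):
            total += f(a + k * h)
        return h * total

    print(trapezoid(math.sin, 0.0, math.pi, 100))   # ~ 1.99984; the exact integral is 2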

Boundary value problems

In an introductory differential equations course you should have looked at a variety of initial value problems, but often boundary value problems (BVPs) are not covered. However, BVPs do arise in many applications, so we will briefly examine two of the major numerical methods used to solve them.

The Shooting Method

The shooting method for boundary value problems is conceptually quite simple: we pick an initial value problem that matches the known conditions on one boundary, and evolve it forward with one of the IVP methods previously discussed (e.g. the fourth-order Runge-Kutta method). At the other boundary we check how close we are to satisfying the remaining condition, and adjust the unknown initial condition accordingly. If there is only one unknown initial condition, then 1-D root-finding methods such as bisection can be used to refine it; for higher-dimensional problems some variant of Newton's method, or a quasi-Newton method such as Broyden's method, can be used.
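A minimal sketch in Python (the helper names and the sample problem y'' = y, y(0) = 0, y(1) = 1 are our own illustrative choices), using RK4 for the integration and bisection for the root finding:

    def rk4_solve(f, x0, y0, x_end, n):
        """Integrate the system y' = f(x, y) (y is a list) with n RK4 steps."""
        h = (x_end - x0) / n
        x, y = x0, list(y0)
        for _ in range(n):
            k1 = f(x, y)
            k2 = f(x + h/2, [yi + h/2 * ki for yi, ki in zip(y, k1)])
            k3 = f(x + h/2, [yi + h/2 * ki for yi, ki in zip(y, k2)])
            k4 = f(x + h,   [yi + h   * ki for yi, ki in zip(y, k3)])
            y = [yi + h/6 * (a + 2*b + 2*c + d)
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
            x += h
        return y

    # BVP: y'' = y, y(0) = 0, y(1) = 1, rewritten as the system y' = v, v' = y.
    def rhs(x, y):
        return [y[1], y[0]]

    def miss(v0):
        """How far the terminal value y(1) misses the target when y'(0) = v0."""
        return rk4_solve(rhs, 0.0, [0.0, v0], 1.0, 100)[0] - 1.0

    # Bisection on the unknown initial slope v0; miss(0) < 0 and miss(2) > 0.
    lo, hi = 0.0, 2.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if miss(lo) * miss(mid) <= 0:
            hi = mid
        else:
            lo = mid
    print(mid)   # ~ 0.8509, the exact initial slope 1/sinh(1)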

Finite Difference Method

Partial Differential Equations

Partial differential equations (PDEs) are perhaps the primary motivation for numerical analysis. They appear in a huge variety of mathematical models. Unfortunately they are also difficult to solve, either analytically or numerically.

Linear PDEs are relatively tractable. There are three important categories of linear PDEs: parabolic, elliptic, and hyperbolic. These have quite different characters and preferred methods of solution, but are also all inter-related.

Joseph Fourier.

We will start with the canonical parabolic PDE, the heat equation, which was the subject of extremely important studies by Joseph Fourier in the early 1800s. The other two categories are exemplified by the standard wave equation (hyperbolic):

\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}

and the 2-D Laplace equation (elliptic):

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0

The Heat Equation

The 1-dimensional heat equation models the heat content u(x,t) in a wire (or some other long narrow object), where x is the spatial variable and t is time. For a homogeneous wire, the PDE is

\frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2}

We can rescale the variables to put this in the standard form \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.

Analytic solution to the heat equation using Fourier series

Since there is an elegant and complete analytic solution to the 1-D heat equation we will briefly review that before describing the numerical methods.

For a PDE we must choose boundary conditions; since we are working with at least two independent variables, this choice is much wider than for ODEs. For the heat equation we will use boundary conditions which fix the temperatures at the spatial boundary - i.e. if we are working on the interval x \in [x_0, x_f] then we choose temperatures T_0 and T_f and for all times t demand that u(x_0,t) = T_0 and u(x_f,t) = T_f. Most of the important ideas for this type of boundary condition can be recycled for the solution of problems with other boundary conditions.

We also need to choose an initial temperature distribution at some t=t_0, u(x,t_0) = f(x) for x \in (x_0, x_f).

For convenience we will use t_0 = 0, x_0 = 0, and x_f = 1; the resulting solution can be rescaled and translated for any other choice of these values.

We can also assume we have homogeneous boundary conditions T_0 = T_f = 0 by subtracting off a steady-state solution v(x). Since it is time-independent, v'' = 0, so v is a linear function v(x) = v_0 + v_1 x. To satisfy the boundary conditions v(0) = T_0 and v(1) = T_f we need to choose v_0 = T_0 and v_1 = T_f - T_0. Now the function \tilde{u}(x,t) = u(x,t) - v(x) has the homogeneous boundary conditions \tilde{u}(0,t) = 0 and \tilde{u}(1,t) = 0. So there is no loss of generality in using u(0,t) = u(1,t) = 0 in what follows.

The next key idea for the analytic solution is to consider a product solution of the form u_n(x,t) = A(x)B(t) for t>0. To satisfy the standard heat equation we need A(x)B'(t) = A''(x)B(t). At points where A(x) and B(t) are nonzero we can rearrange this condition to

\frac{A''(x)}{A(x)} = \frac{B'(t)}{B(t)} = \lambda

where \lambda is a constant that must be independent of t and x. This is called the separation of variables technique, resulting in two ODEs for A(x) and B(t). In this case the ODEs are linear with constant coefficients and have well-known solutions

A(x) = C_1 \sin(\sqrt{-\lambda} x) + C_2 \cos(\sqrt{-\lambda} x)
B(t) = C_3 e^{\lambda t}

where we have assumed with the benefit of hindsight that \lambda is negative. If \lambda is positive, then the solution A(x) is

A(x) = C_1 e^{\sqrt{\lambda} x} + C_2 e^{-\sqrt{\lambda} x}

but it turns out this cannot satisfy our boundary conditions, since A(0) = A(1) = 0 forces C_1 = C_2 = 0 and leaves only the trivial solution. This interaction of the choice of boundary conditions with the behavior of the solution is an important general feature of PDEs.

Returning to the solution

A(x) = C_1 \sin(\sqrt{-\lambda} x) + C_2 \cos(\sqrt{-\lambda} x),

since we need A(0) = 0, we need C_2 = 0. In addition we need A(1) = 0, which puts a condition on \lambda:

\sqrt{-\lambda} = n \pi \ \ \implies \ \ \lambda = - n^2 \pi^2

for some integer n.

So now we have solutions of the form

u_n(x,t) = C_n \sin(n \pi x) e^{-n^2 \pi^2 t}

where we have renamed C_1 to C_n to match the index of u_n.

To complete the solution, we need to satisfy the initial condition u(x,0) = f(x). Because the PDE is linear, we can use a sum of our product solutions u_n(x,t):

u(x,t) = \sum_{n=1}^{\infty} C_n \sin(n \pi x) e^{-n^2 \pi^2 t}

and choose the constants C_n so that

\sum_{n=1}^{\infty} C_n \sin(n \pi x) = f(x)

if this is possible. Much of the initial work on Fourier series was related to this question. A full discussion of the answer(s) is outside the scope of this text (for an adequate summary see Wikipedia); if f(x) is C^1 (has a continuous derivative) on the interval [0,1], then we can choose the C_n so that the Fourier sine series converges to f(x) for all x in (0,1) (and at the endpoints as well when f(0) = f(1) = 0).
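When such an expansion is valid, the coefficients can be written down explicitly: the functions \sin(n \pi x) are orthogonal on [0,1] (the integral of \sin(n \pi x)\sin(m \pi x) over [0,1] is 1/2 if n = m and 0 otherwise), so multiplying the series by \sin(m \pi x) and integrating term by term gives

C_n = 2 \int_0^1 f(x) \sin(n \pi x) \, dx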

There is an excellent set of videos (1, 2) by Grant Sanderson which illustrates much of the above material.

Method of lines (partial discretization)

For the heat equation, and other similar PDEs which are first-order in time, we can leverage any of the numerical methods developed for ODEs by first discretizing the problem in the spatial dimension (or dimensions). As above, we will use as an example the standard heat equation \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} on the interval x \in [0,1] with homogeneous boundary conditions u(0,t) = 0, u(1,t) = 0 and initial condition u(x,0) = f(x).

We begin by choosing a uniform discretization x_0 = 0, x_i = i h, \ldots, x_n = 1 of n+1 values for x, evenly spaced by \Delta x = h = 1/n. We use the centered difference approximation for \frac{\partial^2 u}{\partial x^2} to write a system of ODEs for the n-1 interior points x_i (with i \in \{ 1, \ldots, n-1 \}):

\frac{d u_i}{d t} = \frac{u_{i+1}(t) - 2 u_i(t) + u_{i-1}(t)}{h^2}

where we enforce the boundary conditions with u_0(t) = 0, u_n(t) = 0, and u_i(0) = f(x_i).

We can now use any numerical ODE method we want on this system. The simplest choice, Euler's Method, uses the forward difference in time:

\frac{\partial u_i}{\partial t} \approx \frac{u_{i,j+1} - u_{i,j}}{k}

and now solving for u at the later time yields what is called the Forward Difference Method for parabolic PDEs:

u_{i,j+1} = u_{i,j} + k \frac{u_{i+1,j} - 2 u_{i,j} + u_{i-1,j}}{h^2}

for which we are using a uniform stepsize in time of \Delta t = k, and u_{i,j} is the approximation to u(x,t) at x = x_i and t = t_j = j k. If for each j we consider the vector \vec{u}_j = (u_{1,j}, u_{2,j}, \ldots, u_{n-1,j}) and use \sigma = \frac{k}{h^2} then this method can be written

\vec{u}_{j+1} = A \vec{u}_j

where A is the matrix

A = \left ( \begin{array}{ccccc} 1 - 2\sigma & \sigma & 0 & \ldots & 0 \\ \sigma & 1 - 2\sigma & \sigma & \ddots & 0 \\ 0 & \sigma & 1 - 2\sigma & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & \sigma \\ 0 & \ldots & 0 & \sigma & 1 - 2\sigma \\ \end{array}\right )
Heat equation discretization.
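A minimal Python sketch of this forward difference method (the grid sizes and the initial profile f(x) = \sin(\pi x) are illustrative choices; for this explicit scheme the ratio \sigma = k/h^2 must be kept at or below 1/2 to avoid instability):

    import numpy as np

    def heat_forward(f, n, k, steps):
        """Forward difference method for u_t = u_xx on [0,1] with u(0,t) = u(1,t) = 0,
        n spatial intervals (h = 1/n), time step k, and initial condition u(x,0) = f(x)."""
        h = 1.0 / n
        sigma = k / h**2                  # should satisfy sigma <= 1/2 for stability
        x = np.linspace(0.0, 1.0, n + 1)
        u = f(x).astype(float)
        u[0] = u[-1] = 0.0                # enforce the boundary conditions
        for _ in range(steps):
            u[1:-1] = u[1:-1] + sigma * (u[2:] - 2 * u[1:-1] + u[:-2])
        return x, u

    # With f(x) = sin(pi*x) the exact solution is sin(pi*x) * exp(-pi^2 * t).
    x, u = heat_forward(lambda x: np.sin(np.pi * x), 50, 1e-4, 1000)        # t = 0.1
    print(u[25], np.sin(np.pi * 0.5) * np.exp(-np.pi**2 * 0.1))             # both ~ 0.37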

Exercises

  1. Use the Taylor (power series) expansion of a smooth function f at x=x_0 to find the optimal A,B, and C for the numeric differentiation formula
\left . \frac{d f}{dx}\right |_{x=x_0} \approx \frac{Af(x_0) + Bf(x_0 + h) + Cf(x_0+2h)}{h}
  2. Solve (approximately) the boundary value problem \displaystyle \frac{d^2 y}{dx^2} = 1 + x^2 y^2, y(1) = 1, y(2) = 2 by using Euler's Method with stepsize h=1/2 on the equivalent first-order system
\frac{d y}{dx} = v
\frac{d v}{dx} = 1 + x^2 y^2

with initial conditions y(1) = 1, v(1) = v_0 = ?

  3. For the initial value problem y' = 2 x y, y(0) = 1, compare the following approximations to y(1) (note the exact answer is y(1) = e \approx 2.718281828459045).

    A. Euler's method with 2 steps.

    B. The second order Taylor method

    y_{i+1} = y_i + h f(x_i, y_i) + \frac{h^2}{2} \left (\frac{\partial f}{\partial x}|_{x_i,y_i} + \frac{\partial f}{\partial y}|_{x_i,y_i} f|_{x_i,y_i} \right)

    with one step.

    C. The Midpoint (Runge-Kutta order 2) method

    y_{i+1} = y_i + h f(x_i + h/2, y_i + \frac{h}{2} f(x_i, y_i))

    with one step.

  4. For the initial value problem y' = 2xy, y(0) = 1, compare the following approximations to y(1) (note the exact answer is y(1) = e \approx 2.718281828459045).

    A. Richardson extrapolation, using the Euler method with 2 and 4 steps.

    B. The Runge-Kutta 4th order method with 1 step.

  5. For the initial value problem y' = 2 x y, y(0) = 1, approximate the solution y(1) = e by the following predictor-corrector method:

    A. First compute y_1 \approx y(1/2) using the explicit trapezoid method

    k_1 = f(x_i,y_i), \ \ \ k_2 = f(x_i + h, y_i + h k_1),
    y_{i+1} = y_i + h (k_1 + k_2)/2

    B. Next compute the 'predictor' value from the 2nd-order Adams-Bashforth method

    z_2 = y_1 + h (\frac{3}{2}f_1 - \frac{1}{2}f_0)

    where f_i = f(x_i, y_i).

    C. Finally, estimate y_2 by the 'corrector' Adams-Moulton method using z_2 as an approximation for y_2 in f_2 = f(x_2, y_2) \approx f(x_2, z_2).

    y_2 = y_1 + \frac{h}{12}(5f_2 + 8 f_1 - f_0).
  6. Consider the 1-dimensional heat equation \displaystyle \frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2} with boundary conditions u(0,t) = 0, u(2h,t) = 2, and initial condition u(h,0) = 4. If we discretize the system with \Delta t = k and \Delta x = h, using the forward difference approximation for \displaystyle \frac{\partial u}{\partial t} and the centered second-difference approximation for \displaystyle \frac{\partial^2 u}{\partial x^2}, for what values of k>0 and h>0 will the solution converge (as t \rightarrow \infty, the temperature u(h,t) should approach 1)? Note that since \Delta x = h, only the value u(h, t_i) needs to be computed for each time t_i.