The Discrete Fourier and Cosine Transforms

For a myriad of things with some structure, there are repeating or cyclical elements. Examples include waves (of light, in air and water and other fluids), electric signals, images, weather, economic cycles, musical sounds, and many other phenomena. The Fourier Transform, along with a slew of variants, is the premier mathematical tool for analyzing such periodic structures.

As an example of one of the many uses of the Fourier transform, below are some plots of the amplitudes of frequencies of a few musical sounds.

Cello A4 frequency amplitudes
Cello (note A4) frequency amplitudes.
Female choir A4 frequency amplitudes
Female choir (note A4) frequency amplitudes.
Harp A4 frequency amplitudes
Harp (note A4) frequency amplitudes.

The Discrete Fourier Transform

The famous "Euler's Formula" will be particularly important in what follows:

e^{i \theta} = \cos(\theta) + i \sin(\theta)

The simplest way to define the Discrete Fourier Transform (DFT), is as a complex matrix transformation

y = Fx, where F is matrix of roots of unity divided by \sqrt{n}:

F_{i,j} = w^{i j}/ \sqrt{n}

for i,j \in \{0,1,\ldots n-1\}. Note we begin the indexing at 0, rather than 1, which turns out to be much more convenient in this setting.

The usual choice of a primitive Nth root of unity is the 'clockwise' w = e^{- i 2 \pi/N}.

Unitary lemma

The following fact is very useful in analyzing the Fourier transform: if w is a primitive nth root of unity and k is an integer, then

\sum_{j=0}^{n-1} w^{j k} = \left \{ \begin{array}{c} n \ \text{ if n divides k} \\ 0 \end{array} \right .

To prove this, we begin by using the factorization

0 = 1 - w^n = (1 - w)(1 + w + w^2 + \ldots w^{n-1})

so since w \neq 1,

(1 + w + w^2 + \ldots w^{n-1}) = 0

Examples

The 4-point DFT is particularly simple:

F_4 = \frac{1}{2} \left(\begin{array}{rrrr} 1 & 1 & 1 & 1 \\ 1 & w & w^{2} & w^{3} \\ 1 & w^{2} & w^{4} & w^{6} \\ 1 & w^{3} & w^{6} & w^{9} \end{array}\right) = \frac{1}{2} \left(\begin{array}{rrrr} 1 & 1 & 1 & 1 \\ 1 & w & w^{2} & w^{3} \\ 1 & w^{2} & w^{4} & w^{2} \\ 1 & w^{3} & w^{2} & w \end{array}\right) = \frac{1}{2}\left(\begin{array}{rrrr} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{array}\right)

In general, DFTs of powers of 2 are simpler to deal with, and many programs restrict to this case. Here is the Fourier matrix for N=8:

F_8 = \frac{1}{2 \sqrt{2}} \left(\begin{array}{rrrrrrrr} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & w & w^{2} & w^{3} & w^{4} & w^{5} & w^{6} & w^{7} \\ 1 & w^{2} & w^{4} & w^{6} & w^{8} & w^{10} & w^{12} & w^{14} \\ 1 & w^{3} & w^{6} & w^{9} & w^{12} & w^{15} & w^{18} & w^{21} \\ 1 & w^{4} & w^{8} & w^{12} & w^{16} & w^{20} & w^{24} & w^{28} \\ 1 & w^{5} & w^{10} & w^{15} & w^{20} & w^{25} & w^{30} & w^{35} \\ 1 & w^{6} & w^{12} & w^{18} & w^{24} & w^{30} & w^{36} & w^{42} \\ 1 & w^{7} & w^{14} & w^{21} & w^{28} & w^{35} & w^{42} & w^{49} \end{array}\right)
= \frac{1}{2 \sqrt{2}} \left(\begin{array}{rrrrrrrr} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & -\left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) & -i & -\left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) & -1 & \left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) & i & \left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) \\ 1 & -i & -1 & i & 1 & -i & -1 & i \\ 1 & -\left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) & i & -\left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) & -1 & \left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) & -i & \left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\ 1 & \left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) & -i & \left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) & -1 & -\left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) & i & -\left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) \\ 1 & i & -1 & -i & 1 & i & -1 & -i \\ 1 & \left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) & i & \left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) & -1 & -\left(\frac{\sqrt{2}}{2} i + \frac{\sqrt{2}}{2}\right) & -i & -\left(\frac{\sqrt{2}}{2} i - \frac{\sqrt{2}}{2}\right) \end{array}\right)

Shift and Circulant Diagonalization

There are many ways to think about the DFT. One of its properties is that the Fourier matrix F diagonalizes any shift permutation. Let P_R be the right shift permutation matrix for which if x = (x_0, x_1, \ldots, x_{N-1})^T, then P_R x = (x_{N-1}, x_0, x_1, \ldots, x_{N-2})^T. For example, for N=4 the matrix P_R is

P_R = \left(\begin{array}{rrrr} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right)

The characteristic equation of the N by N matrix P_R is \lambda^N - 1 = 0, so its eigenvalues are the Nth-roots of unity. The eigenvectors are any nonzero multiple of the columns of the Fourier matrix F, so F^{-1} P_R F is a diagonal matrix with the Nth roots of unity on the diagonal.

More generally, the matrix F will diagonalize any linear combination of powers of P_R. Such a linear combination is called a circulant matrix. For example, the matrix A below is a circulant matrix:

A = \left(\begin{array}{rrrr} 5 & -1 & 3 & 2 \\ 2 & 5 & -1 & 3 \\ 3 & 2 & 5 & -1 \\ -1 & 3 & 2 & 5 \end{array}\right)

and its diagonalization is

F^{-1} A F = \left(\begin{array}{cccc} 9 & 0 & 0 & 0 \\ 0 & 3 i + 2 & 0 & 0 \\ 0 & 0 & 7 & 0 \\ 0 & 0 & 0 & -3 i + 2 \end{array}\right)

Trigonometric interpolation

The Fourier transform can be used to construct trigonometric interpolants. For simplicity we will begin by using an artificial time scale, thinking of N input data points x = (x_0, x_1, \ldots x_{N-1}) as a time series (0,x_0), (1,x_1), \ldots, (N-1, x_{N-1}). Once we have the interpolating formula for this set up, we can rescale the time to match an arbitrary interval.

The Fourier matrix can be thought of as a Vandermonde matrix for the complex functions w^{kt} = e^{- 2 \pi k t/N}, for k = 0, 1, \ldots, N-1, at equally spaced timepoints t = 0, 1, \ldots, N-1. In this interpretation, we can use the Fourier output y = Fx = (y_0, y_1, \ldots, y_{N-1}) to construct a complex interpolant Z(t) = F^{-1}(t)y as:

Z(t) = \frac{1}{\sqrt{n}} \left ( y_0 + y_1 e^{2 \pi t/N} + y_2 e^{4 \pi t/N} + \ldots + y_{N-1} e^{2(N-1) \pi t/N} \right )

For real inputs x, we can recover a real trigonometric interpolant by using the real part of Z(t). This works, but (confusingly!) its not how most people use and think of the Fourier transform.

For real x, the transform y has the property that if we reflect the entries across the middle, they are complex conjugates: y_{i} = \overline{y_{N-i}} (except for y_0, which is real, and if N is even, y_{N/2} is also real). Also, at the timepoint t=j, w^{kj} = \overline{w^{(N-k)j}}, so for k>N/2 we can choose to think of the columns as coming from \overline{w^{(N-k)j}} instead of w^{kt}. This results in an interpolation that uses the lowest frequencies possible, which usually makes more sense in applications. If we write the output in terms of its real and imaginary parts, y_k = a_k + i b_k, then the lower-frequency interpolant formula is:

P_n(t) = \frac{a_0}{\sqrt{n}} + \frac{1}{\sqrt{n}}\sum_{k=1}^{(n/2)-1} \left ( 2 a_k \cos(\frac{2k\pi t}{n}) - 2 b_k \sin(\frac{2k\pi t}{n}) \right ) + \frac{a_{n/2}}{\sqrt{n}}\cos(\frac{n\pi t }{n})

We can make the affine transformation t \rightarrow n \frac{t - c}{d-c} to interpolate data that is evenly sampled on an interval [c,d) (starting at c) instead of 0,1,\ldots,N-1 \in [0,N) to get the more general form

P_n(t) = \frac{a_0}{\sqrt{n}} + \frac{1}{\sqrt{n}}\sum_{k=1}^{(n/2)-1} \left ( 2 a_k \cos(\frac{2k\pi(t-c)}{d-c}) - 2 b_k \sin(\frac{2k\pi(t-c)}{d-c}) \right ) + \frac{a_{n/2}}{\sqrt{n}}\cos(\frac{n\pi(t-c)}{d-c})

These two different versions of interpolation are shown below in an example with 8 data points, which were constructed by adding a small amount of noise to a cosine function with frequency 3. The green curve is the lower-frequency, more commonly used version (P_8(t)), the red curve is the higher-frequency Re(Z(t)).

Aliasing and trigonometric interpolation
Two different trigonometric interpolations of the same data. Blue line is linear interpolation.

The Fast Fourier Transform

For even N, the main idea is to reorder the input into the even-indexed and odd-indexed parts; on each of these parts, the transform can be decoupled into two pieces of size N/2. If N is a power of 2, i.e. N = 2^m, then we can repeat this m = \log_2(N) times, and the total operations needed to compute the transform are of order N \log(N), much much better than the full matrix multiplication which needs N^2 operations. The Fast Fourier Transform (FFT) is so efficient it is practically free for many purposes with current computer hardware.

FFT visualized
A visualization of the key step in the FFT for even N

This decomposition can be viewed as a kind of matrix factorization. For even N, we can write the Fourier matrix F_N as the product T H P, where T is a matrix of four diagonal blocks, H is a block-diagonal matrix with blocks of the half-sized transform F_{N/2}, and P is a permutation matrix separating out the even and odd components of the input vector. In the figure below, this decomposition is illustrated with colors instead of exact formula; zero is indicated by white, while the non-zero entries all lie on the unit circle in the complex plane, which is colored as indicated by the circle on the right.

FFT visualized

FFT for noise removal

Since noise tends to be more spectrally flat than most recorded sound (especially compared to tonal music), the Fourier transform can be used to remove some of the noise component. The noise spectrum needs to be modeled, which is usually done by analyzing a known background signal (i.e. a recording of "silence"). Only spectral components significantly above the noise background are kept, and the remaining components are inverted back into a time-domain signal.

Transforms for real data

Real FFT, with sine and cosine rows. General properties of orthogonal function transforms and least-squares.

In general, we can use a real orthogonal

A = \left(\begin{array}{rrrr} f_0(t_0) & f_0(t_1) & \ldots & f_0(t_{n-1}) \\ f_1(t_0) & f_1(t_1) & \ldots & f_1(t_{n-1})\\ \vdots & \vdots & \vdots & \vdots \\ f_{n-1}(t_0) & f_{n-1}(t_1) & \ldots & f_{n-1}(t_{n-1})\\ \end{array}\right)

...and then if y = Ax, then x = A^T y and then we get interpolating F(t) = \sum_{k=0}^{n-1} y_k f_k(t).

To get least-squares solution, we just use selected columns of A (selected rows from A^T).

For n even, real orthogonal matrix example

A = \sqrt{\frac{2}{n}}\left(\begin{array}{rrrr} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \ldots& \frac{1}{\sqrt{2}} \\ 1 & \cos(\frac{2 \pi}{n}) & \ldots & \cos(\frac{2 \pi (n-1)}{n}) \\ 0 & \sin(\frac{2 \pi}{n}) & \ldots & \sin(\frac{2 \pi (n-1)}{n}) \\ 1 & \cos(\frac{4 \pi}{n}) & \ldots & \cos(\frac{4 \pi (n-1)}{n}) \\ 0 & \sin(\frac{4 \pi}{n}) & \ldots & \sin(\frac{4 \pi (n-1)}{n}) \\ \vdots & \vdots & \vdots & \vdots \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \cos(\pi) & \ldots &\frac{1}{\sqrt{2}} \cos(\pi (n-1)) \\ \end{array}\right)

Even better is the Discrete Cosine Transform,

C = \sqrt{\frac{2}{n}}\left(\begin{array}{rrrr} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \ldots& \frac{1}{\sqrt{2}} \\ \cos(\frac{\pi}{2 n}) & \cos(\frac{3 \pi}{2n}) & \ldots & \cos(\frac{(2n-1) \pi}{2n}) \\ \cos(\frac{2\pi}{2 n}) & \cos(\frac{6 \pi}{2n}) & \ldots & \cos(\frac{ 2 (2n-1) \pi}{2n}) \\ \vdots & \vdots & \vdots & \vdots \\ \cos(\frac{(n-1) \pi}{2 n}) & \cos(\frac{(n-1) 3 \pi}{2n}) & \ldots & \cos(\frac{(n-1) (2n-1) \pi}{2n}) \\ \end{array}\right)

The rows of C are the unit eigenvectors of the second-difference matrix

\Delta_2 = \left(\begin{array}{rrrrrr} 1 & -1 & & & & \\ -1 & 2 & -1 & & & \\ & -1 & 2 & -1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & 1 & 2 & -1 \\ & & & & -1 & 1 \\ \end{array}\right)

Exercises

  1. Compute the DFT of x = (1,1,-1,-1) and use it to find a (real) interpolating trigonometric through the points (0,1), (1,1), (2,-1), (3,-1).

  2. Use the fact that \cos(x) = \frac{e^{ix} + e^{-ix}}{2} to compute all possible values of the following sum, in which j_1, j_2, and n are integers:

    \sum_{k=0}^{n-1} \cos \left ( \frac{2 \pi j_1 k}{n} \right ) \cos \left ( \frac{2 \pi j_2 k}{n} \right )
  3. Find a non-zero real vector x = (x_0, x_1, x_2, x_3)^T for which the discrete Fourier transform has a purely imaginary output. Is this unique (up to a scalar multiple)?

  4. Find a non-zero real vector x = (x_0, x_1, x_2, x_3)^T for which the discrete Fourier transform has a purely real output. Is this unique (up to a scalar multiple)?