Matrix Norms and Linear Systems

Vector norms

A vector norm is a generalization of length. The first formalization of length was what we now call the Euclidean norm. In two dimensions the Euclidean norm of a vector x = (x_1, x_2) is |x|_2 = \sqrt{x_1^2 + x_2^2}. Usually if someone uses the term 'length' with no other explanation, they mean the Euclidean norm. Abstracting the key properties of length into the generalization of vector norms gives us some additional freedom which can be useful in proofs and applications.

The defining properties of a vector norm are:

  1. ||x|| \ge 0, with ||x|| = 0 if and only if x = 0.

  2. ||x + y|| \le ||x|| + ||y||

  3. || c x || = |c| ||x|| for scalar c.

The second property is called the triangle inequality, and requiring it for norms preserves a lot of our intuition about the idea of lengths.

The standard vector norms (p-norms)

For a vector x = (x_1, x_2, \ldots, x_n), the most commonly used norms are in the family defined by:

|x|_p = \left ( \sum_i |x_i|^p \right )^{1/p}

where p \in [1, \infty]. The most important particular norms in this family are p=1, p=2 (the Euclidean norm), and p=\infty. The p=\infty norm is defined as the limit as p \rightarrow \infty; it is also called the max norm because in the limit we have

|x|_{\infty} = \max(|x_i|)

So for example the vector x = (1,2,-3) has a 1-norm of |x|_1 = 1 + 2 + 3 = 6, a 2-norm of |x|_2 = \sqrt{1 + 4 + 9} = \sqrt{14}, and an \infty-norm of |x|_{\infty} = 3.
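These values are easy to check numerically; a minimal sketch, assuming NumPy is available:

import numpy as np

x = np.array([1.0, 2.0, -3.0])

print(np.linalg.norm(x, 1))       # 6.0
print(np.linalg.norm(x, 2))       # 3.7416... = sqrt(14)
print(np.linalg.norm(x, np.inf))  # 3.0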

Figure: unit "circles" for various p-norms.

Induced vector-matrix norms

For matrices, we usually use the norms induced by the vector norms:

||A||_p = \max_{||x||_p = 1} ||Ax||_p

These matrix norms gauge the 'size' of a matrix by the maximum factor by which it stretches a vector when acting on it from the left.

One of the desirable properties that any induced matrix norm has is the sub-multiplicative property:

||A B|| \le ||A|| \, ||B||

Example:

Let us compute the induced 1,2,\infty-norms for

A = \left ( \begin{array}{rr} 1 & 2 \\ 0 & -2 \end{array} \right )

The induced 1-norm is the maximum of the sum of absolute values of the entries in each column. For this matrix, the second column has a sum of 2 + 2 = 4, so |A|_1 = 4.

The induced \infty-norm is also easy to calculate - it is just the maximum of the sum of absolute values of each row's entries. For this matrix, the first row has the largest sum, so |A|_{\infty} = 1 + 2 =3.

The induced 2-norm is more difficult to compute. One relatively simple way to compute it is through the formula |A|_2 = \max_i \sqrt{\lambda_i}, where the \lambda_i are the eigenvalues of A^T A (or A A^T). As we will see in the section on the Singular Value Decomposition (SVD), the matrix A^T A is symmetric and always diagonalizable with an orthogonal basis of eigenvectors, so there are no major problems in computing its eigenvalues.
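A minimal numerical check of these values, assuming NumPy is available (np.linalg.norm computes the induced matrix norms for p = 1, 2, \infty directly):

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, -2.0]])

print(np.linalg.norm(A, 1))        # 4.0, the maximum absolute column sum
print(np.linalg.norm(A, np.inf))   # 3.0, the maximum absolute row sum

# The induced 2-norm via the eigenvalues of A^T A.
lam = np.linalg.eigvalsh(A.T @ A)
print(np.sqrt(lam.max()))          # about 2.92
print(np.linalg.norm(A, 2))        # the same value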

The Frobenius Norm

While we usually want to use induced matrix norms, there are some situations where other norms are convenient. The most important of these is the Frobenius norm, which simply treats A as a vector:

||A||_F = \sqrt{\sum_{i,j} |a_{i,j}|^2}

although it can also be expressed in the more obviously invariant forms:

||A||_F = \sqrt{tr(A^* A)} = \sqrt {\sum _{i=1}^{\min\{m,n\}}\sigma _{i}^{2}(A)}

where tr denotes the trace of the matrix and the \sigma_i are the singular values.

The Frobenius norm is sub-multiplicative ( |A B|_F \le |A|_F |B|_F ).
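As a quick check that the three expressions above agree, here is a sketch assuming NumPy is available:

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, -2.0]])

print(np.sqrt((A**2).sum()))                                   # entrywise definition: 3.0
print(np.sqrt(np.trace(A.T @ A)))                              # trace form
print(np.sqrt((np.linalg.svd(A, compute_uv=False)**2).sum()))  # via singular values
print(np.linalg.norm(A, 'fro'))                                # NumPy's built-in Frobenius norm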

Equivalence of norms

Any two norms are bounded relative to each other, in the sense that there are always constants C_1, C_2 (which depend on the two norms) such that for norms a and b and all vectors x

C_1 ||x||_a \le ||x||_b \le C_2 ||x||_a

For example, for some of the common p-norms we have

|x|_{\infty} \le |x|_2 \le \sqrt{n} |x|_{\infty}

and

|x|_{\infty} \le |x|_1 \le n |x|_{\infty}

Condition numbers

The (relative) condition number of f at the input x measures how a relative change in x is amplified into a relative change in f(x); equivalently, it is the ratio of the relative forward error to the relative backward error:

\kappa = \frac{|\delta f|/|f|}{|\delta x|/|x|} \approx \frac{|x| |Df|}{|f|}

where Df denotes the derivative of f at x.

For linear problems, such as solving for x in Ax=b, this ratio of errors is bounded by \kappa(A) = |A| |A^{-1}|, which is often called the condition number of A. It does depend on the norm used; usually it is assumed to be the induced Euclidean norm (induced 2-norm).

Example: condition number for Hilbert matrices

The Hilbert matrices are an infamous example of a family of matrices with rapidly increasing condition number. The entries of an n by n Hilbert matrix are defined as H_{i,j} = \frac{1}{i+j-1} (if we start indexing the rows and columns at 1, so i,j \in \{ 1, \ldots, n\}).

For example, for n=4 the Hilbert matrix is

H = \left(\begin{array}{rrrr} 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \end{array}\right)

which already has a 1-norm condition number of about 28,375. For the 20 by 20 Hilbert matrix the condition number is about 3.4 \cdot 10^{19}, and it is easy to lose all meaningful precision if using standard 64-bit floating point numbers with such a matrix.
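These condition numbers are easy to reproduce; a minimal sketch assuming SciPy is available (scipy.linalg.hilbert builds the matrix, numpy.linalg.cond computes the condition number in a chosen norm):

import numpy as np
from scipy.linalg import hilbert

for n in (4, 8, 12):
    H = hilbert(n)                  # H[i, j] = 1/(i + j + 1) with 0-based indices
    print(n, np.linalg.cond(H, 1))  # 1-norm condition number; about 2.8e4 for n = 4

Beyond n of roughly a dozen the exact condition number exceeds the reciprocal of machine epsilon, so the value computed in double precision is itself dominated by rounding error and should be treated with caution.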

Direct Linear Methods

The most elementary problem in linear algebra is to solve a linear system

Ax = b

where A is a square (n by n) matrix, x is an unknown vector, and b is a given vector. Solution techniques for relatively small (3 by 3 or so) systems have been known for thousands of years, for example in the Chinese text Nine Chapters on the Mathematical Art. Newton described a more systematic method in the late 17th century, which is more or less what many people call Gaussian Elimination.

The method you are most likely to have learned in an introductory linear algebra course is often called Gauss-Jordan elimination. In this method, the augmented coefficient matrix (A|b) is converted to its reduced row-echelon form (rref) using the three fundamental row operations: (1) rescaling a row by a nonzero constant, (2) interchanging (swapping) two rows, and (3) adding a multiple of one row to another row.

Gaussian Elimination and the LU factorization

Often in applications the matrix A is fixed but the vector b is changed. In this case it is more efficient to compute a factorization of A which can be used to efficiently solve the system for any b. The factorization is usually written as PA = LU, although it could also be written as A = P^{-1} L U. Here P is a permutation matrix (so PA has the same rows as A, but possibly in a different order), L is a lower-triangular matrix, and U is upper-triangular.
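A sketch of this reuse pattern with SciPy's LU routines: scipy.linalg.lu_factor computes the pivoted factorization once, and scipy.linalg.lu_solve then only performs the two triangular solves for each new right-hand side. The matrix here is an arbitrary example.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))      # an arbitrary example matrix

lu, piv = lu_factor(A)                   # PA = LU with partial pivoting, computed once

for _ in range(3):                       # reuse the factorization for several b's
    b = rng.standard_normal(100)
    x = lu_solve((lu, piv), b)           # forward + back substitution only
    print(np.allclose(A @ x, b))         # True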

Pivoting

The basic LU factorization cannot succeed in some cases because of the presence of zero entries, for example the matrix

A = \left ( \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right )

immediately causes a problem: we cannot eliminate the (2,1) entry using a multiple of the (1,1) entry, because that entry is zero.

More subtly, using multiples of small entries for elimination can lead to a serious loss of precision.

Example: loss of precision without pivoting

Consider the system Ax=b where

A = \left ( \begin{array}{cc} 10^{-20} & 1 \\ 1 & 1 \end{array} \right )

and b = (1,0)^T. If we eliminate the (2,1) entry by adding -10^{20} times row 1 to row 2, we would obtain the decomposition

A = \left ( \begin{array}{cc} 1 & 0 \\ 10^{20} & 1 \end{array} \right ) \left ( \begin{array}{cc} 10^{-20} & 1 \\ 0 & -10^{20}+1 \end{array} \right )

In double-precision floating point the entry -10^{20}+1 rounds to -10^{20}, which is exactly the factorization we would get if A_{2,2} were 0 instead of 1. Forward and back substitution with the rounded factors gives x \approx (0, 1)^T, while the true solution is x \approx (-1, 1)^T: the tiny pivot has destroyed all accuracy in x_1.
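The effect is easy to reproduce; the following sketch (assuming NumPy) carries out the elimination without pivoting in double precision and compares the result with a pivoted solve:

import numpy as np

A = np.array([[1e-20, 1.0],
              [1.0,   1.0]])
b = np.array([1.0, 0.0])

# Eliminate without pivoting: multiplier m = a21/a11 = 1e20.
m = A[1, 0] / A[0, 0]
u22 = A[1, 1] - m * A[0, 1]            # 1 - 1e20 rounds to -1e20
y2 = b[1] - m * b[0]                   # forward substitution
x2 = y2 / u22                          # 1.0
x1 = (b[0] - A[0, 1] * x2) / A[0, 0]   # (1 - 1)/1e-20 = 0.0, badly wrong

print(x1, x2)                          # approximately (0, 1)
print(np.linalg.solve(A, b))           # approximately (-1, 1), which is correct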

Example PA=LU decomposition

Let's look at a particular example of a PA=LU decomposition for a 3 by 3 matrix A:

A = \left ( \begin{array}{rrr} 2 & 1 & 5 \\ 4 & 4 & -4 \\ 1 & 3 & 1 \end{array} \right )

With partial pivoting, we want to move the largest entry in the first column, A_{2,1} = 4, to the upper left corner. We can do this with the permutation matrix

P = \left ( \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right )

so now we have

P A = \left ( \begin{array}{rrr} 4 & 4 & -4 \\ 2 & 1 & 5 \\ 1 & 3 & 1 \end{array} \right )

Next we can add -1/2 row 1 to row 2, and -1/4 row 1 to row 3, to obtain two zeros in the first column. At that point we have:

P A = \left ( \begin{array}{rrr} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ 1/4 & 0 & 1 \end{array} \right ) \left ( \begin{array}{rrr} 4 & 4 & -4 \\ 0 & -1 & 7 \\ 0 & 2 & 2 \end{array} \right )

Since the bottom row now has a larger pivot candidate than the second row (|2| > |-1|), we swap rows 2 and 3. This means our permutation matrix is now

P = \left ( \begin{array}{rrr} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{array} \right )

and

P A = \left ( \begin{array}{rrr} 1 & 0 & 0 \\ 1/4 & 1 & 0 \\ 1/2 & 0 & 1 \end{array} \right ) \left ( \begin{array}{rrr} 4 & 4 & -4 \\ 0 & 2 & 2 \\ 0 & -1 & 7 \end{array} \right )

Note that in the lower-triangular matrix we only swap the entries below the diagonal (the multipliers computed so far).

Finally, we need to add 1/2 of row 2 to row 3. This gives us

PA =\left ( \begin{array}{rrr} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{array} \right ) \left ( \begin{array}{rrr} 2 & 1 & 5 \\ 4 & 4 & -4 \\ 1 & 3 & 1 \end{array} \right ) = \left ( \begin{array}{rrr} 1 & 0 & 0 \\ 1/4 & 1 & 0 \\ 1/2 & -1/2 & 1 \end{array} \right ) \left ( \begin{array}{rrr} 4 & 4 & -4 \\ 0 & 2 & 2 \\ 0 & 0 & 8 \end{array} \right ) = LU
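This worked example can be checked against SciPy's LU routine. Note that scipy.linalg.lu returns factors with A = P L U, so its permutation matrix is the transpose (inverse) of the P used above:

import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1.0, 5.0],
              [4.0, 4.0, -4.0],
              [1.0, 3.0, 1.0]])

P, L, U = lu(A)                       # A = P @ L @ U
print(L)                              # multipliers 1/4, 1/2, -1/2 as computed above
print(U)                              # pivots 4, 2, 8 on the diagonal
print(np.allclose(P.T @ A, L @ U))    # True: P.T here plays the role of P in the text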

Basic example: an A = LU factorization with no row exchanges

\left ( \begin{array}{cccc} 2 & 1 & 1 & 0 \\ 4 & 3 & 3 & 1 \\ 8 & 7 & 9 & 5 \\ 6 & 7 & 9 & 8 \end{array} \right ) = \left ( \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 4 & 3 & 1 & 0 \\ 3 & 4 & 1 & 1 \end{array} \right ) \left ( \begin{array}{cccc} 2 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 2 \end{array} \right )

Example of pathological error in Gaussian elimination. (Trefethen and Bau.)

Some matrices cause problems even with partial pivoting. A worst-case example is the following:

A = \left ( \begin{array}{rrrrr} 1 & 0 & 0 & 0 & 1 \\ -1 & 1 & 0 & 0 & 1 \\ -1 & -1 & 1 & 0 & 1 \\ -1 & -1 & -1 & 1 & 1 \\ -1 & -1 & -1 & -1 & 1 \end{array} \right )

which has the LU decomposition

A = \left(\begin{array}{rrrrr} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ -1 & -1 & 1 & 0 & 0 \\ -1 & -1 & -1 & 1 & 0 \\ -1 & -1 & -1 & -1 & 1 \end{array}\right) \left(\begin{array}{rrrrr} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 8 \\ 0 & 0 & 0 & 0 & 16 \end{array}\right)

The exponential growth of the entries in the right-most column causes numerical precision issues: for the n by n version of this matrix the growth factor is 2^{n-1}, so for moderately large n the computed solution can lose all accuracy even with partial pivoting.
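A sketch of this effect (assuming NumPy): build the n by n version of this matrix, solve a system with a known solution using NumPy's pivoted Gaussian elimination, and watch the relative error grow with n.

import numpy as np

def worst_case(n):
    # Unit diagonal, -1 below the diagonal, last column all ones.
    A = np.tril(-np.ones((n, n)), -1) + np.eye(n)
    A[:, -1] = 1.0
    return A

rng = np.random.default_rng(0)
for n in (10, 30, 60):
    A = worst_case(n)
    x = rng.standard_normal(n)
    b = A @ x
    x_hat = np.linalg.solve(A, b)   # Gaussian elimination with partial pivoting (LAPACK)
    # The growth factor 2^(n-1) shows up as a rapidly growing relative error.
    print(n, np.linalg.norm(x_hat - x) / np.linalg.norm(x))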

Partial pivoting algorithm in pseudocode

U = A, L = I, P = I
for k=1 to m-1:
    Select i>=k to maximize |u_{ik}|
    interchange u_{k,k:m} with u_{i,k:m}, l_{k,1:k-1} with l_{i,1:k-1}, and p_k with p_i
    for j=k+1 to m:
        l_{jk} = u_{jk}/u_{kk}
        u_{j,k:m} = u_{j,k:m} - l_{jk} u_{k,k:m}

Operation count: about \frac{2}{3}m^3 floating-point operations for an m by m matrix.
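A direct translation of this pseudocode into Python, as a sketch assuming NumPy (for real work one would use a library routine such as scipy.linalg.lu_factor):

import numpy as np

def lu_partial_pivot(A):
    """Compute P, L, U with PA = LU by Gaussian elimination with partial pivoting."""
    m = A.shape[0]
    U = A.astype(float)
    L = np.eye(m)
    P = np.eye(m)
    for k in range(m - 1):
        i = k + np.argmax(np.abs(U[k:, k]))   # select i >= k maximizing |u_ik|
        U[[k, i], k:] = U[[i, k], k:]         # swap the active parts of rows k and i
        L[[k, i], :k] = L[[i, k], :k]         # swap only the multipliers already stored
        P[[k, i], :] = P[[i, k], :]
        for j in range(k + 1, m):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:] -= L[j, k] * U[k, k:]
    return P, L, U

A = np.random.default_rng(1).standard_normal((5, 5))
P, L, U = lu_partial_pivot(A)
print(np.allclose(P @ A, L @ U))              # True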

Indirect (iterative) linear methods

Figure: Philipp Ludwig von Seidel.

For large, sparse systems, and for sequences of related linear problems, indirect (iterative) methods can be vastly superior to direct methods such as the PA=LU decomposition.

The simplest iterative method is the Jacobi method. The idea of this method is to split a matrix into its diagonal and off-diagonal parts: A = D + R. Then the equation Ax = b can be put into the iterative form

x_{k+1} = D^{-1}(b - Rx_k)

This converges if the iteration matrix -D^{-1}R = I - D^{-1}A has spectral radius less than 1; since the sign does not affect the spectral radius, this is the same as requiring \rho(D^{-1}R) < 1.


Example of Jacobi Method

Let's see how the Jacobi method works on a small example, where

A = \left(\begin{array}{rr} 2 & 1 \\ 1 & 3 \end{array}\right)

and b=(1,1)^T.

The exact value of x solving Ax=b is (\frac{2}{5},\frac{1}{5})^T. Suppose we start with a guess of x_0 = (1,1)^T. At each iteration step we will need

D^{-1} b = \left(\begin{array}{rr} 1/2 & 0 \\ 0 & 1/3 \end{array}\right) \left(\begin{array}{r} 1 \\ 1 \end{array}\right) = \left(\begin{array}{r} 1/2 \\ 1/3 \end{array}\right)

and

D^{-1} R = \left(\begin{array}{rr} 1/2 & 0 \\ 0 & 1/3 \end{array}\right) \left(\begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array}\right) = \left(\begin{array}{rr} 0 & \frac{1}{2} \\ \frac{1}{3} & 0 \end{array}\right)

The iteration step is

x_{k+1} = D^{-1}(b - Rx_k) = \left(\begin{array}{r} 1/2 \\ 1/3 \end{array}\right) - \left(\begin{array}{rr} 0 & \frac{1}{2} \\ \frac{1}{3} & 0 \end{array}\right) \left(\begin{array}{r} x_{k,1} \\ x_{k,2} \end{array}\right)

The first few iterations yield x_1 = (0,0)^T, x_2 = (1/2, 1/3)^T, and after ten steps

x_{10} = \left(\begin{array}{r} 1037/2592 \\ 389/1944 \end{array}\right) \approx \left(\begin{array}{r} 0.40008 \\ 0.2001 \end{array}\right)

which has a relative error of about 0.0003. For this small dense example the Jacobi method is very inefficient, but this is the opposite extreme from the large sparse systems for which it can be a reasonable choice.
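A minimal implementation of the Jacobi iteration, applied to this example (assuming NumPy); the diagonal is inverted componentwise rather than forming D^{-1} as a matrix:

import numpy as np

def jacobi(A, b, x0, iterations):
    """Jacobi iteration x_{k+1} = D^{-1} (b - R x_k), where A = D + R."""
    d = np.diag(A)                 # the diagonal D, stored as a vector
    R = A - np.diag(d)             # the off-diagonal part
    x = x0.astype(float)
    for _ in range(iterations):
        x = (b - R @ x) / d        # apply D^{-1} componentwise
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 1.0])

x10 = jacobi(A, b, np.array([1.0, 1.0]), 10)
exact = np.array([0.4, 0.2])
print(x10)                                                   # about (0.40008, 0.20010)
print(np.linalg.norm(x10 - exact) / np.linalg.norm(exact))   # about 3e-4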


In the Gauss-Seidel method, the matrix is split into a lower-triangular part and a strictly upper-triangular part: A = (D + L) + U. The lower-triangular part D + L includes the (nonzero) diagonal, while U has all zeros on its diagonal. Then we use the iterative scheme:

(D+L) x_{k+1} = (b - Ux_k)

which can be efficiently solved by forward substitution without explicitly inverting the matrix (D+L).

For the previous example, with A = \left(\begin{array}{rr} 2 & 1 \\ 1 & 3 \end{array}\right), the Gauss-Seidel method has a relative error of about 10^{-8} after ten iterations - much better than the Jacobi method.
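A sketch of the Gauss-Seidel sweep (assuming NumPy), written as an explicit forward substitution so that no matrix is inverted:

import numpy as np

def gauss_seidel(A, b, x0, iterations):
    """Solve (D+L) x_{k+1} = b - U x_k by forward substitution, sweep by sweep."""
    n = len(b)
    x = x0.astype(float)
    for _ in range(iterations):
        for i in range(n):
            # Entries x[:i] are already updated (the forward-substitution step).
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 1.0])

x10 = gauss_seidel(A, b, np.array([1.0, 1.0]), 10)
exact = np.array([0.4, 0.2])
print(np.linalg.norm(x10 - exact) / np.linalg.norm(exact))   # far smaller than the Jacobi error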

Successive Over-Relaxation

For some problems convergence can be improved by moving a little more in the direction of the updated solution. The simplest version of this type of method is to use the Gauss-Seidel method with a weight w for the old and new estimates, i.e.

(D+L) \tilde{x}_{k+1} = (b - Ux_k)
x_{k+1} = (1-w) x_k + w \tilde{x}_{k+1}

These two steps can be combined into one for a more efficient implementation.

When w > 1 these methods are called Successive Over-Relaxation (SOR) methods.
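A sketch of the relaxed scheme exactly as written above: one Gauss-Seidel sweep followed by blending with weight w (assuming NumPy; w = 1 recovers plain Gauss-Seidel). The value of w used in the demonstration call is arbitrary.

import numpy as np

def relaxed_gauss_seidel(A, b, x0, w, iterations):
    """Gauss-Seidel sweep (D+L) x_gs = b - U x_k, then blend: x_{k+1} = (1-w) x_k + w x_gs."""
    n = len(b)
    x = x0.astype(float)
    for _ in range(iterations):
        x_gs = x.copy()
        for i in range(n):
            # Forward substitution for (D + L) x_gs = b - U x_k.
            s = A[i, :i] @ x_gs[:i] + A[i, i + 1:] @ x[i + 1:]
            x_gs[i] = (b[i] - s) / A[i, i]
        x = (1 - w) * x + w * x_gs
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 1.0])

x = relaxed_gauss_seidel(A, b, np.array([1.0, 1.0]), 1.1, 10)
print(x)          # close to the exact solution (0.4, 0.2)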

Example of SOR

Let's look at what happens if we apply the Gauss-Seidel method with SOR to the problem Ax = b where

A = \left(\begin{array}{rrrrrrr} 3 & -\frac{1}{2} & 0 & 0 & 0 & 0 & \frac{1}{2} \\ -1 & 3 & -\frac{1}{2} & 0 & 0 & \frac{1}{2} & 0 \\ 0 & -1 & 3 & -1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 3 & -1 & 0 & 0 \\ 0 & 0 & 0 & -1 & 3 & -1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 3 & -1 \\ 0 & 0 & 0 & 0 & 0 & -1 & 3 \end{array}\right), \ \ \ \ b = \left(\begin{array}{r} 3 \\ 2 \\ 1 \\ 1 \\ 1 \\ 1 \\ 2 \end{array}\right)

The number of iterations until convergence depends on the value of w, as shown below:

Figure: number of iterations until convergence as a function of w.

We can see that for this particular problem the optimal weight is around w=1.12.

Optimal weight for SOR

In general there is no formula for the weight used in SOR, but there are a variety of special cases that have been studied. One approach is to perform some number of iterations with w=1, and then switch to a new w_o given at the kth iteration by

w_o = \frac{2}{1 + \sqrt{1 - \left( \frac{\Delta x^{(k+p)}}{\Delta x^{(k)}} \right)^{1/p}}}

where \Delta x^{(k)} denotes the change in the iterate at step k and p is a small positive integer.

Exercises