WebSafe 3.7fncbook.com
|
|
🏠
Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Conditioning of linear systems

We are ready to consider the conditioning of solving the square linear system Ax=b\mathbf{A}\mathbf{x}=\mathbf{b}. In this problem, the data are A\mathbf{A} and b\mathbf{b}, and the solution is x\mathbf{x}. Both data and result are multidimensional, so we will use norms to measure their magnitudes.

The motivation for the definition of relative condition number in Chapter 1 was to quantify the response of the result to perturbations of the data. For simplicity, we start by allowing perturbations to b\mathbf{b} only while A\mathbf{A} remains fixed.

Let Ax=b\mathbf{A}\mathbf{x}=\mathbf{b} be perturbed to

A(x+h)=b+d. \mathbf{A}(\mathbf{x}+\mathbf{h}) = \mathbf{b}+\mathbf{d}.

The condition number should be the relative change in the solution divided by relative change in the data,

βˆ₯hβˆ₯βˆ₯xβˆ₯βˆ₯dβˆ₯βˆ₯bβˆ₯=βˆ₯hβˆ₯β€…β€Šβˆ₯bβˆ₯βˆ₯dβˆ₯β€…β€Šβˆ₯xβˆ₯. \frac{\quad\dfrac{\| \mathbf{h} \| }{\| \mathbf{x} \| }\quad}{\dfrac{\| \mathbf{d} \| }{\| \mathbf{b} \|}} = \frac{\| \mathbf{h} \|\;\| \mathbf{b} \| }{\| \mathbf{d} \|\; \| \mathbf{x} \| }.

We can bound βˆ₯hβˆ₯\| \mathbf{h} \| in terms of βˆ₯dβˆ₯\| \mathbf{d} \|:

Ax+Ah=b+d,Ah=d,h=Aβˆ’1d,βˆ₯hβˆ₯≀βˆ₯Aβˆ’1βˆ₯ βˆ₯dβˆ₯,\begin{split} \mathbf{A}\mathbf{x} + \mathbf{A} \mathbf{h} &= \mathbf{b} + \mathbf{d}, \\ \mathbf{A} \mathbf{h} &= \mathbf{d},\\ \mathbf{h} &= \mathbf{A}^{-1} \mathbf{d},\\ \| \mathbf{h} \| &\le \| \mathbf{A}^{-1}\| \,\| \mathbf{d} \|, \end{split}

where we have applied Ax=b\mathbf{A}\mathbf{x}=\mathbf{b} and (2.7.8). Since also b=Ax\mathbf{b}=\mathbf{A}\mathbf{x} implies βˆ₯bβˆ₯≀βˆ₯Aβˆ₯ βˆ₯xβˆ₯\| \mathbf{b} \|\le \| \mathbf{A} \|\, \| \mathbf{x} \|, we derive

βˆ₯hβˆ₯β€…β€Šβˆ₯bβˆ₯βˆ₯dβˆ₯β€…β€Šβˆ₯xβˆ₯≀(βˆ₯Aβˆ’1βˆ₯ βˆ₯dβˆ₯)(βˆ₯Aβˆ₯ βˆ₯xβˆ₯)βˆ₯dβˆ₯ βˆ₯xβˆ₯=βˆ₯Aβˆ’1βˆ₯ βˆ₯Aβˆ₯. \frac{\| \mathbf{h} \|\; \| \mathbf{b} \|}{\| \mathbf{d} \|\; \| \mathbf{x} \|} \le \frac{\bigl(\| \mathbf{A}^{-1} \|\, \| \mathbf{d} \|\bigr) \bigl(\| \mathbf{A} \|\,\| \mathbf{x} \|\bigr)}{\| \mathbf{d} \|\,\| \mathbf{x} \|} = \| \mathbf{A}^{-1}\| \, \| \mathbf{A} \|.

It is possible to show that this bound is tight, in the sense that the inequalities are in fact equalities for some choices of b\mathbf{b} and d\mathbf{d}. This result motivates a new definition.

2.8.1Main resultΒΆ

The matrix condition number (2.8.5) is equal to the condition number of solving a linear system of equations. Although we derived this fact above only for perturbations of b\mathbf{b}, a similar statement holds when A\mathbf{A} is perturbed.

Using a traditional Ξ”\Delta notation for the perturbation in a quantity, we can write the following.

Note that for any induced matrix norm,

1=βˆ₯Iβˆ₯=βˆ₯AAβˆ’1βˆ₯≀βˆ₯Aβˆ₯ βˆ₯Aβˆ’1βˆ₯=ΞΊ(A). 1 = \| \mathbf{I} \| = \| \mathbf{A} \mathbf{A}^{-1} \| \le \| \mathbf{A} \|\, \| \mathbf{A}^{-1} \| = \kappa(\mathbf{A}).

A condition number of 1 is the best we can hope forβ€”in that case, the relative perturbation of the solution has the same size as that of the data. A condition number of size 10t10^t indicates that in floating-point arithmetic, roughly tt digits are lost (i.e., become incorrect) in computing the solution x\mathbf{x}. And if ΞΊ(A)>Ο΅machβˆ’1\kappa(\mathbf{A}) > \epsilon_\text{mach}^{-1}, then for computational purposes the matrix is effectively singular.

2.8.2Residual and backward errorΒΆ

Suppose that Ax=b\mathbf{A}\mathbf{x}=\mathbf{b} and x~\tilde{\mathbf{x}} is a computed estimate of the solution x\mathbf{x}. The most natural quantity to study is the error, xβˆ’x~\mathbf{x}-\tilde{\mathbf{x}}. Normally we can’t compute it because we don’t know the exact solution. However, we can compute something related.

Obviously, a zero residual means that x~=x\tilde{\mathbf{x}}=\mathbf{x}, and we have the exact solution. What happens more generally? Note that Ax~=bβˆ’r\mathbf{A}\tilde{\mathbf{x}}=\mathbf{b}-\mathbf{r}. That is, x~\tilde{\mathbf{x}} solves the linear system problem for a right-hand side that is changed by βˆ’r-\mathbf{r}. This is precisely what is meant by backward error.

Hence residual and backward error are the same thing for a linear system. What is the connection to the (forward) error? We can reconnect with (2.8.6) by the definition h=x~βˆ’x\mathbf{h} = \tilde{\mathbf{x}}-\mathbf{x}, in which case

d=A(x+h)βˆ’b=Ah=βˆ’r.\mathbf{d} = \mathbf{A}(\mathbf{x}+\mathbf{h})-\mathbf{b}=\mathbf{A}\mathbf{h} = -\mathbf{r}.

Thus, (2.8.6) is equivalent to

βˆ₯xβˆ’x~βˆ₯βˆ₯xβˆ₯≀κ(A)βˆ₯rβˆ₯βˆ₯bβˆ₯. \frac{\| \mathbf{x}-\tilde{\mathbf{x}} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \mathbf{r} \|}{\| \mathbf{b} \|}.

Equation (2.8.11) says that the gap between relative error and the relative residual is a multiplication by the matrix condition number.

2.8.3ExercisesΒΆ