Conditioning of linear systems
Tobin A. Driscoll and Richard J. Braun
We are ready to consider the conditioning of solving the square linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$. In this problem, the data are $\mathbf{A}$ and $\mathbf{b}$, and the solution is $\mathbf{x}$. Both data and result are multidimensional, so we will use norms to measure their magnitudes.

The motivation for the definition of the relative condition number in Chapter 1 was to quantify the response of the result to perturbations of the data. For simplicity, we start by allowing perturbations to $\mathbf{b}$ only, while $\mathbf{A}$ remains fixed.
Let $\mathbf{A}\mathbf{x}=\mathbf{b}$ be perturbed to

$$
\mathbf{A}(\mathbf{x}+\mathbf{h}) = \mathbf{b}+\mathbf{d}.
$$

The condition number should be the relative change in the solution divided by the relative change in the data,

$$
\frac{\dfrac{\| \mathbf{h} \|}{\| \mathbf{x} \|}}{\dfrac{\| \mathbf{d} \|}{\| \mathbf{b} \|}}
= \frac{\| \mathbf{h} \|\,\| \mathbf{b} \|}{\| \mathbf{d} \|\,\| \mathbf{x} \|}.
$$

We can bound $\| \mathbf{h} \|$ in terms of $\| \mathbf{d} \|$:
$$
\begin{split}
\mathbf{A}\mathbf{x} + \mathbf{A}\mathbf{h} &= \mathbf{b} + \mathbf{d}, \\
\mathbf{A}\mathbf{h} &= \mathbf{d}, \\
\mathbf{h} &= \mathbf{A}^{-1}\mathbf{d}, \\
\| \mathbf{h} \| &\le \| \mathbf{A}^{-1} \|\,\| \mathbf{d} \|,
\end{split}
$$

where we have applied $\mathbf{A}\mathbf{x}=\mathbf{b}$ and (2.7.8).
Since $\mathbf{b}=\mathbf{A}\mathbf{x}$ also implies $\| \mathbf{b} \| \le \| \mathbf{A} \|\,\| \mathbf{x} \|$, we derive

$$
\frac{\| \mathbf{h} \|\,\| \mathbf{b} \|}{\| \mathbf{d} \|\,\| \mathbf{x} \|}
\le \frac{\bigl(\| \mathbf{A}^{-1} \|\,\| \mathbf{d} \|\bigr)\bigl(\| \mathbf{A} \|\,\| \mathbf{x} \|\bigr)}{\| \mathbf{d} \|\,\| \mathbf{x} \|}
= \| \mathbf{A}^{-1} \|\,\| \mathbf{A} \|.
$$

It is possible to show that this bound is tight, in the sense that the inequalities are in fact equalities for some choices of $\mathbf{b}$ and $\mathbf{d}$. This result motivates a new definition.
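The derivation above is easy to test numerically. The following NumPy sketch (an arbitrary random example, not from the text) perturbs $\mathbf{b}$ many times and checks that the relative-change ratio never exceeds $\| \mathbf{A}^{-1} \|\,\| \mathbf{A} \|$:

```python
import numpy as np

# A hypothetical random example; any invertible matrix behaves the same way.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
x = rng.standard_normal(5)
b = A @ x

# The derived upper bound ||A^{-1}|| * ||A||, in the 2-norm.
bound = np.linalg.norm(np.linalg.inv(A), 2) * np.linalg.norm(A, 2)

for _ in range(100):
    d = 1e-8 * rng.standard_normal(5)      # perturbation of b
    h = np.linalg.solve(A, b + d) - x      # resulting change in the solution
    rel_change_x = np.linalg.norm(h) / np.linalg.norm(x)
    rel_change_b = np.linalg.norm(d) / np.linalg.norm(b)
    assert rel_change_x <= bound * rel_change_b * (1 + 1e-6)  # slack for roundoff
```

Some perturbation directions come close to the bound, while most fall well below it; tightness says only that the worst case attains it.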
The matrix condition number of an invertible square matrix $\mathbf{A}$ is

$$
\kappa(\mathbf{A}) = \| \mathbf{A}^{-1} \|\,\| \mathbf{A} \|.
$$

This value depends on the choice of norm; a subscript on $\kappa$ such as 1, 2, or $\infty$ is used if clarification is needed. If $\mathbf{A}$ is singular, we define $\kappa(\mathbf{A}) = \infty$.
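The definition translates directly into code. A minimal NumPy check (the matrix here is an arbitrary example, not one from the text) computes $\kappa$ from the definition and compares it with the built-in cond:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # arbitrary invertible example

# The definition: kappa(A) = ||A^{-1}|| * ||A||, here in the 2-norm.
kappa_direct = np.linalg.norm(np.linalg.inv(A), 2) * np.linalg.norm(A, 2)

# NumPy's built-in cond computes the same quantity (via the SVD).
kappa_builtin = np.linalg.cond(A, 2)

assert abs(kappa_direct - kappa_builtin) < 1e-10 * kappa_builtin
```

In practice one always calls the built-in routine; forming $\mathbf{A}^{-1}$ explicitly is wasteful and, for badly conditioned matrices, inaccurate.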
2.8.1 Main result

The matrix condition number (2.8.5) is equal to the condition number of solving a linear system of equations. Although we derived this fact above only for perturbations of $\mathbf{b}$, a similar statement holds when $\mathbf{A}$ is perturbed.

Using the traditional $\Delta$ notation for the perturbation of a quantity, we can write the following.

If $\mathbf{A}(\mathbf{x} + \Delta \mathbf{x}) = \mathbf{b} + \Delta \mathbf{b}$, then

$$
\frac{\| \Delta \mathbf{x} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \Delta \mathbf{b} \|}{\| \mathbf{b} \|}.
$$

If $(\mathbf{A}+\Delta \mathbf{A})(\mathbf{x} + \Delta \mathbf{x}) = \mathbf{b}$, then

$$
\frac{\| \Delta \mathbf{x} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \Delta \mathbf{A} \|}{\| \mathbf{A} \|},
$$

in the limit $\| \Delta \mathbf{A} \| \to 0$.
Note that for any induced matrix norm,

$$
1 = \| \mathbf{I} \| = \| \mathbf{A}\mathbf{A}^{-1} \| \le \| \mathbf{A} \|\,\| \mathbf{A}^{-1} \| = \kappa(\mathbf{A}).
$$

A condition number of 1 is the best we can hope for: in that case, the relative perturbation of the solution has the same size as that of the data. A condition number of size $10^t$ indicates that in floating-point arithmetic, roughly $t$ digits are lost (i.e., become incorrect) in computing the solution $\mathbf{x}$. And if $\kappa(\mathbf{A}) > \epsilon_\text{mach}^{-1}$, then for computational purposes the matrix is effectively singular.
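These facts are cheap to verify. The sketch below (an arbitrary random matrix, chosen only for illustration) checks that $\kappa \ge 1$, that the identity achieves the minimum, and that $\kappa$ is unchanged by rescaling $\mathbf{A}$, so it measures something other than mere size:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))          # arbitrary example matrix

kappa = np.linalg.cond(A, 2)
assert kappa >= 1.0                      # any induced norm gives kappa >= 1

# The identity achieves the minimum: kappa(I) = 1.
assert np.isclose(np.linalg.cond(np.eye(6), 2), 1.0)

# kappa is invariant under A -> c*A, since the norms of A and A^{-1}
# pick up reciprocal factors of c.
assert np.isclose(np.linalg.cond(1e6 * A, 2), kappa)
```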
Julia has a function cond to compute matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $6\times 6$ case.
Type \kappa followed by Tab to get the Greek letter κ.
A = [ 1 / (i + j) for i in 1:6, j in 1:6 ]
κ = cond(A)

Because $\kappa \approx 10^8$, it's possible to lose nearly 8 digits of accuracy in the process of passing from $\mathbf{A}$ and $\mathbf{b}$ to $\mathbf{x}$. That fact is independent of the algorithm; it's inevitable once the data are expressed in finite precision.
Let's engineer a linear system problem to observe the effect of a perturbation. We will make sure we know the exact answer.

x = 1:6
b = A * x

6-element Vector{Float64}:
4.407142857142857
3.564285714285714
3.013095238095238
2.6174603174603175
2.317279942279942
2.0807359307359308
Now we perturb the system matrix and vector randomly by $10^{-10}$ in norm.
# type \Delta then Tab to get Δ
ΔA = randn(size(A)); ΔA = 1e-10 * (ΔA / opnorm(ΔA));
Δb = randn(size(b)); Δb = 1e-10 * normalize(Δb);
We solve the perturbed problem using pivoted LU and see how the solution was changed.
new_x = ((A + ΔA) \ (b + Δb))
Δx = new_x - x

6-element Vector{Float64}:
-3.381709709338043e-5
0.0005667519323755421
-0.002923429505386377
0.006407006686722561
-0.006272818214712039
0.0022604173596221244
Here is the relative error in the solution.
@show relative_error = norm(Δx) / norm(x);

relative_error = norm(Δx) / norm(x) = 0.001018381937354831
And here are upper bounds predicted using the condition number of the original matrix.
println("Upper bound due to b: $(κ * norm(Δb) / norm(b))")
println("Upper bound due to A: $(κ * opnorm(ΔA) / opnorm(A))")

Upper bound due to b: 0.0006723667714415053
Upper bound due to A: 0.004566989873968762
Even if we didn't make any manual perturbations to the data, machine roundoff does so at the relative level of $\epsilon_\text{mach}$.
Δx = A\b - x
@show relative_error = norm(Δx) / norm(x);
@show rounding_bound = κ * eps();

relative_error = norm(Δx) / norm(x) = 7.822650774976615e-10
rounding_bound = κ * eps() = 1.134607141124313e-8
Larger Hilbert matrices are even more poorly conditioned:

A = [ 1 / (i + j) for i=1:14, j=1:14 ];
κ = cond(A)

Note that $\kappa$ exceeds $1/\epsilon_\text{mach}$. In principle we therefore may end up with an answer that has relative error greater than 100%.

rounding_bound = κ * eps()

Let's put that prediction to the test.
x = 1:14
b = A * x
Δx = A\b - x
@show relative_error = norm(Δx) / norm(x);

relative_error = norm(Δx) / norm(x) = 4.469466154206132
As anticipated, the solution has zero accurate digits in the 2-norm.
MATLAB has a function cond to compute matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $6\times 6$ case.
A = hilb(6)
kappa = cond(A)

Because $\kappa \approx 10^8$, it's possible to lose nearly 8 digits of accuracy in the process of passing from $\mathbf{A}$ and $\mathbf{b}$ to $\mathbf{x}$. That fact is independent of the algorithm; it's inevitable once the data are expressed in finite precision.
Let's engineer a linear system problem to observe the effect of a perturbation. We will make sure we know the exact answer.
Now we perturb the system matrix and vector randomly by $10^{-10}$ in norm.
dA = randn(size(A)); dA = 1e-10 * (dA / norm(dA));
db = randn(size(b)); db = 1e-10 * (db / norm(db));
We solve the perturbed problem using pivoted LU and see how the solution was changed.
new_x = ((A + dA) \ (b + db));
dx = new_x - x;
Here is the relative error in the solution.
relative_error = norm(dx) / norm(x)

And here are upper bounds predicted using the condition number of the original matrix.
upper_bound_b = kappa * norm(db) / norm(b)
upper_bound_A = kappa * norm(dA) / norm(A)

Even if we didn't make any manual perturbations to the data, machine roundoff does so at the relative level of $\epsilon_\text{mach}$.
dx = A\b - x;
relative_error = norm(dx) / norm(x)
rounding_bound = kappa * eps

Larger Hilbert matrices are even more poorly conditioned:
A = hilb(14);
kappa = cond(A)

Note that $\kappa$ exceeds $1/\epsilon_\text{mach}$. In principle we therefore may end up with an answer that has relative error greater than 100%.

rounding_bound = kappa * eps

Let's put that prediction to the test.
x = (1:14)'; b = A * x;
dx = A\b - x;
relative_error = norm(dx) / norm(x)

Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 7.292772e-20.

As anticipated, the solution has zero accurate digits in the 2-norm.
The function cond from numpy.linalg computes matrix condition numbers. By default, the 2-norm is used. As an example, the family of Hilbert matrices is famously badly conditioned. Here is the $6\times 6$ case.
A = array([
[1/(i + j + 2) for j in range(6)]
for i in range(6)
])
print(A)

[[0.5 0.33333333 0.25 0.2 0.16666667 0.14285714]
[0.33333333 0.25 0.2 0.16666667 0.14285714 0.125 ]
[0.25 0.2 0.16666667 0.14285714 0.125 0.11111111]
[0.2 0.16666667 0.14285714 0.125 0.11111111 0.1 ]
[0.16666667 0.14285714 0.125 0.11111111 0.1 0.09090909]
[0.14285714 0.125 0.11111111 0.1 0.09090909 0.08333333]]
from numpy.linalg import cond
from scipy import linalg
kappa = cond(A)
print(f"kappa is {kappa:.3e}")

Next we engineer a linear system problem to which we know the exact answer.
x_exact = 1.0 + arange(6)
b = A @ x_exact
Now we perturb the data randomly with a vector of norm $10^{-12}$.
dA = random.randn(6, 6)
dA = 1e-12 * (dA / linalg.norm(dA, 2))
db = random.randn(6)
db = 1e-12 * (db / linalg.norm(db, 2))
We solve the perturbed problem using built-in pivoted LU and see how the solution was changed.
x = linalg.solve(A + dA, b + db)
dx = x - x_exact
Here is the relative error in the solution.
print(f"relative error is {linalg.norm(dx) / linalg.norm(x_exact):.2e}")

relative error is 1.71e-05
And here are upper bounds predicted using the condition number of the original matrix.
print(f"b_bound: {kappa * 1e-12 / linalg.norm(b):.2e}")
print(f"A_bound: {kappa * 1e-12 / linalg.norm(A, 2):.2e}")

b_bound: 6.72e-06
A_bound: 4.57e-05
Even if we donβt make any manual perturbations to the data, machine epsilon does when we solve the linear system numerically.
x = linalg.solve(A, b)
print(f"relative error: {linalg.norm(x - x_exact) / linalg.norm(x_exact):.2e}")
print(f"rounding bound: {kappa / 2**52:.2e}")

relative error: 8.66e-10
rounding bound: 1.13e-08
Because $\kappa \approx 10^8$, it's possible to lose 8 digits of accuracy in the process of passing from $A$ and $b$ to $x$. That's independent of the algorithm; it's inevitable once the data are expressed in double precision.
Larger Hilbert matrices are even more poorly conditioned.
A = array([ [1/(i+j+2) for j in range(14)] for i in range(14) ])
kappa = cond(A)
print(f"kappa is {kappa:.3e}")

Before we compute the solution, note that $\kappa$ exceeds 1/eps. In principle we therefore might end up with an answer that is completely wrong (i.e., a relative error greater than 100%).

print(f"rounding bound: {kappa / 2**52:.2e}")

x_exact = 1.0 + arange(14)
b = A @ x_exact
x = linalg.solve(A, b)

/Users/driscoll/miniforge3/envs/myst/lib/python3.13/site-packages/scipy/_lib/_util.py:1233: LinAlgWarning: Ill-conditioned matrix (rcond=6.26637e-19): result may not be accurate.
  return f(*arrays, *other_args, **kwargs)
We got an answer. But in fact, the error does exceed 100%:
print(f"relative error: {linalg.norm(x - x_exact) / linalg.norm(x_exact):.2e}")

2.8.2 Residual and backward error

Suppose that $\mathbf{A}\mathbf{x}=\mathbf{b}$ and $\tilde{\mathbf{x}}$ is a computed estimate of the solution $\mathbf{x}$. The most natural quantity to study is the error, $\mathbf{x}-\tilde{\mathbf{x}}$. Normally we can't compute it because we don't know the exact solution. However, we can compute something related: the residual, $\mathbf{r} = \mathbf{b} - \mathbf{A}\tilde{\mathbf{x}}$.
Obviously, a zero residual means that $\tilde{\mathbf{x}}=\mathbf{x}$, and we have the exact solution. What happens more generally? Note that $\mathbf{A}\tilde{\mathbf{x}}=\mathbf{b}-\mathbf{r}$. That is, $\tilde{\mathbf{x}}$ solves the linear system problem for a right-hand side that is changed by $-\mathbf{r}$. This is precisely what is meant by backward error.
Hence residual and backward error are the same thing for a linear system. What is the connection to the (forward) error? We can reconnect with (2.8.6) via the definition $\mathbf{h} = \tilde{\mathbf{x}}-\mathbf{x}$, in which case

$$
\mathbf{d} = \mathbf{A}(\mathbf{x}+\mathbf{h})-\mathbf{b} = \mathbf{A}\mathbf{h} = -\mathbf{r}.
$$

Thus, (2.8.6) is equivalent to

$$
\frac{\| \mathbf{x}-\tilde{\mathbf{x}} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \mathbf{r} \|}{\| \mathbf{b} \|}.
$$

Equation (2.8.11) says that the gap between the relative error and the relative residual is a multiplication by the matrix condition number.
When solving a linear system, all that can be expected is that the backward error, not the error, is small.
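A short NumPy experiment illustrates the point; the Vandermonde matrix below is an arbitrary moderately ill-conditioned example, not one from the text. The relative residual of the computed solution is tiny, while the forward error may be larger by a factor of order $\kappa(\mathbf{A})$, consistent with (2.8.11).

```python
import numpy as np

# An arbitrary ill-conditioned example (equispaced Vandermonde matrix).
rng = np.random.default_rng(2)
A = np.vander(np.linspace(0, 1, 10), increasing=True)
x = rng.standard_normal(10)
b = A @ x

x_tilde = np.linalg.solve(A, b)
r = b - A @ x_tilde                      # the residual: always computable

rel_res = np.linalg.norm(r) / np.linalg.norm(b)
rel_err = np.linalg.norm(x - x_tilde) / np.linalg.norm(x)
kappa = np.linalg.cond(A, 2)

# The backward error (relative residual) is tiny even when the forward
# error is much larger; the gap is governed by kappa, as in (2.8.11).
# A small safety factor allows for roundoff in computing r itself.
assert rel_res < 1e-11
assert rel_err <= 10 * kappa * rel_res + 1e-14
```

This is the practical meaning of backward stability: the solver returns the exact solution of a problem whose data are perturbed at roundoff level, and any amplification beyond that is attributable to the conditioning of the problem itself.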
2.8.3 Exercises

⌨ Refer to Example 2.8.1 for the definition of a Hilbert matrix. Make a table of the values of $\kappa(\mathbf{H}_n)$ in the 2-norm for $n=2,3,\ldots,16$. Speculate as to why the growth of $\kappa$ appears to slow down at $n=13$.
⌨ The purpose of this problem is to verify, as in Example 2.8.1, the error bound

$$
\frac{\| \mathbf{x}-\tilde{\mathbf{x}} \|}{\| \mathbf{x} \|} \le \kappa(\mathbf{A}) \frac{\| \mathbf{h} \|}{\| \mathbf{b} \|}.
$$

Here $\tilde{\mathbf{x}}$ is a numerical approximation to the exact solution $\mathbf{x}$, and $\mathbf{h}$ is an unknown perturbation caused by machine roundoff. We will assume that $\| \mathbf{h} \|/\| \mathbf{b} \|$ is roughly eps().
Use the MatrixDepot package. For each $n=10,20,\ldots,70$, let A = matrixdepot("prolate", n, 0.4) and let $\mathbf{x}$ have components $x_k=k/n$ for $k=1,\ldots,n$. Define b = A*x and let xtilde be the solution produced numerically by backslash.
For each $n=10,20,\ldots,70$, let A = gallery("prolate", n, 0.4) and let $\mathbf{x}$ have components $x_k=k/n$ for $k=1,\ldots,n$. Define b = A*x and let xtilde be the solution produced numerically by backslash.
You will need to import the rogues package from PyPI. For each $n=10,20,\ldots,70$, let A = rogues.prolate(n, 0.4) and let $\mathbf{x}$ have components $x_k=k/n$ for $k=1,\ldots,n$. Define b = A@x and let xtilde be the solution produced numerically by numpy.linalg.solve.
Make a table including columns for $n$, the condition number of $\mathbf{A}$, the observed relative error in $\tilde{\mathbf{x}}$, and the right-hand side of the inequality above. You should find that the inequality holds in every case.
⌨ Exercise 2.3.7 suggests that the solutions of linear systems

$$
\mathbf{A} = \begin{bmatrix} 1 & -1 & 0 & \alpha-\beta & \beta \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} \alpha \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
$$

become less accurate as $\beta$ increases. Using $\alpha=0.1$ and $\beta=10,10^2,\ldots,10^{12}$, make a table with columns for $\beta$, $|x_1-1|$, and the condition number of the matrix.
⌨ Let $\mathbf{A}_n$ denote the $(n+1)\times(n+1)$ version of the Vandermonde matrix in (2.1.3) based on the equally spaced interpolation nodes $t_i=i/n$ for $i=0,\ldots,n$. Using the 1-norm, plot $\kappa(\mathbf{A}_n)$ as a function of $n$ for $n=4,5,6,\ldots,20$, using a log scale on the $y$-axis. (The graph is nearly a straight line.)
⌨ The matrix $\mathbf{A}$ in (2.6.2) has unpivoted LU factors given in (2.6.3) as a function of the parameter $\epsilon$. For $\epsilon = 10^{-2},10^{-4},\ldots,10^{-10}$, make a table with columns for $\epsilon$, $\kappa(\mathbf{A})$, $\kappa(\mathbf{L})$, and $\kappa(\mathbf{U})$. (This shows that solution via unpivoted LU factorization is arbitrarily unstable.)
✍ Define $\mathbf{A}_n$ as the $n\times n$ matrix

$$
\begin{bmatrix}
1 & -2 & & & \\
& 1 & -2 & & \\
& & \ddots & \ddots & \\
& & & 1 & -2 \\
& & & & 1
\end{bmatrix}.
$$
(a) Write out $\mathbf{A}_2^{-1}$ and $\mathbf{A}_3^{-1}$.

(b) Write out $\mathbf{A}_n^{-1}$ in the general case $n>1$. (If necessary, look at a few more cases on the computer until you are certain of the pattern.) Make a clear argument why it is correct.

(c) Using the $\infty$-norm, find $\kappa(\mathbf{A}_n)$.
✍ (a) Prove that for $n\times n$ nonsingular matrices $\mathbf{A}$ and $\mathbf{B}$, $\kappa(\mathbf{A}\mathbf{B})\le \kappa(\mathbf{A})\,\kappa(\mathbf{B})$.

(b) Show by means of an example that the result of part (a) cannot be an equality in general.
✍ Let $\mathbf{D}$ be a diagonal $n\times n$ matrix, not necessarily invertible. Prove that in the 1-norm,

$$
\kappa(\mathbf{D}) = \frac{\max_i |D_{ii}|}{\min_i |D_{ii}|}.
$$

(Hint: See Exercise 2.7.10.)