Lecture Topic: Systems of Linear Equations
Introduction: systems of linear equations
We examine both iterative and direct methods for solving equations
\[ Ax = b \qquad (1) \]
where $x, b \in \mathbb{R}^n$, $A \in M_n(\mathbb{R})$ is an $n \times n$ matrix, $x$ is an unknown vector (to be found), and $b$ and $A$ are known.
Solving systems of linear equations is still the most important problem in computational mathematics, because it is used as a sub-problem in solving other problems.
Algorithms that solve non-linear systems commonly use linear approximations, which give rise to systems of linear equations.
Algorithms that optimise over feasible sets given by linear and non-linear equalities and inequalities commonly solve systems related to first-order optimality conditions iteratively, which give rise to systems of linear equations.
Jakub Marecek and Sean McGarraghy (UCD) Numerical Analysis and Software October 8, 2015 2 / 1
Mathematics and Algorithms
In Quantitative Methods you have learned the mathematical ideas behind direct and iterative approaches to solving (1). You should understand:
Theorem
The following are equivalent for any $n \times n$ matrix $A$:
$Ax = b$ has a unique solution for all $b \in \mathbb{R}^n$.
$Ax = 0$ implies $x = 0$.
$A^{-1}$ exists.
$\det(A) \neq 0$.
rank(A) = n.
The full rank of A is also our assumption throughout the chapter.
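The equivalences in the theorem can be probed numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `nonsingularity_checks`, the threshold `tol`, and the two test matrices are illustrative assumptions, not part of the lecture. In floating point each test needs a tolerance, and the determinant test in particular is sensitive to scaling.

```python
import numpy as np

def nonsingularity_checks(A, tol=1e-12):
    """Evaluate three of the theorem's equivalent conditions numerically.

    `tol` is an illustrative threshold (an assumption), not a universal constant.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    sigma = np.linalg.svd(A, compute_uv=False)  # singular values, descending
    return {
        "rank(A) == n": bool(np.linalg.matrix_rank(A) == n),
        "det(A) != 0": bool(abs(np.linalg.det(A)) > tol),
        # The null space is trivial iff the smallest singular value is nonzero.
        "Ax = 0 only for x = 0": bool(sigma[-1] > tol),
    }

regular = np.array([[2.0, 1.0], [1.0, 3.0]])    # det = 5
singular = np.array([[1.0, 2.0], [2.0, 4.0]])   # second row = 2 * first row

checks_regular = nonsingularity_checks(regular)
checks_singular = nonsingularity_checks(singular)
print(checks_regular)
print(checks_singular)
```

For the regular matrix all three checks agree on "true", and for the singular one all three agree on "false", as the theorem predicts.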
Here, we build on these and analyse the related algorithms, focussing first on conditioning, second on the stability of direct methods, and third on the convergence and stability of iterative methods.
This still leaves much unexplained, including conjugate gradients (CG), generalised minimal residuals (GMRES), and preconditioning, i.e. methods for improving the conditioning of a system.
Condition of a System of Linear Equations
Condition describes how changes in A and b, the “instance” of the problem, affect the solution x, regardless of the algorithm used.
We will examine errors in A and b separately.
It turns out that in both cases the condition number of the matrix A plays a role.
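An Example
Consider the system of linear equations
\[
\begin{aligned}
x_1 + 0.99\,x_2 &= 1.99 \\
0.99\,x_1 + 0.98\,x_2 &= 1.97.
\end{aligned}
\]
The true solution is $x_1 = 1$ and $x_2 = 1$, but $x_1 = 3.0000$ and $x_2 = -1.0203$ gives
\[
\begin{aligned}
x_1 + 0.99\,x_2 &= 1.989903 \\
0.99\,x_1 + 0.98\,x_2 &= 1.970106.
\end{aligned}
\]
Thus, a small change in the problem data, a change in the vector $b$ from
\[
\begin{pmatrix} 1.99 \\ 1.97 \end{pmatrix}
\quad\text{to}\quad
\begin{pmatrix} 1.989903 \\ 1.970106 \end{pmatrix},
\]
leads to a large change in the solution: this is our criterion for ill-conditioning.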
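The example can be reproduced in a few lines. This is a minimal sketch, assuming NumPy; the variable names are illustrative. It solves the system for the original and the perturbed right-hand side and compares the relative changes in $b$ and in $x$ (2-norm).

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])
b = np.array([1.99, 1.97])
b_perturbed = np.array([1.989903, 1.970106])

x = np.linalg.solve(A, b)                      # close to (1, 1)
x_perturbed = np.linalg.solve(A, b_perturbed)  # close to (3, -1.0203)

rel_change_b = np.linalg.norm(b_perturbed - b) / np.linalg.norm(b)
rel_change_x = np.linalg.norm(x_perturbed - x) / np.linalg.norm(x)

print("x           =", x)
print("x_perturbed =", x_perturbed)
print(f"relative change in b: {rel_change_b:.2e}")
print(f"relative change in x: {rel_change_x:.2e}")
```

The relative change in $x$ is more than four orders of magnitude larger than the relative change in $b$.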
[Figure: plot in the $(x_1, x_2)$ plane, with both axes running from −1 to 3.]
[Figure: plot in the $(x_1, x_2)$ plane, with both axes running from 0.9999 to 1.0001.]
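Perturbation of b
Let the right-hand side $b$ be perturbed by $\delta b$. So we want to find the solution of
\[ A(x + \delta x) = b + \delta b. \qquad (2.1) \]
Here $\|\cdot\|$ denotes a vector or matrix norm, according to context. By (2.1) and $Ax = b$, we have $A\,\delta x = \delta b$, so
\[ \delta x = A^{-1}\delta b \;\Rightarrow\; \|\delta x\| \le \|A^{-1}\|\,\|\delta b\| \quad \text{(a sharp bound)}. \qquad (2.2) \]
Since $b = Ax$, the properties of matrix norms again give
\[ \|b\| \le \|A\|\,\|x\|. \qquad (2.3) \]
Hence, combining (2.2) and (2.3): all quantities are non-negative, so the product of the left-hand sides is at most the product of the right-hand sides:
\[ \|\delta x\|\,\|b\| \le \|A\|\,\|A^{-1}\|\,\|x\|\,\|\delta b\|, \]
and assuming $b \neq 0$ (so that $x \neq 0$) we get
\[ \frac{\|\delta x\|}{\|x\|} \le \|A\|\,\|A^{-1}\|\,\frac{\|\delta b\|}{\|b\|}, \]
i.e., (rel. error in $x$) $\le \|A\|\,\|A^{-1}\|$ (rel. error in $b$).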
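The bound can be checked on the earlier $2 \times 2$ example. This is a minimal sketch, assuming NumPy and the 2-norm; the variable names are illustrative. It computes both sides of the final inequality for the perturbation of $b$ used in the example.

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])
b = np.array([1.99, 1.97])
delta_b = np.array([1.989903, 1.970106]) - b

x = np.linalg.solve(A, b)
delta_x = np.linalg.solve(A, delta_b)   # A δx = δb, as derived from (2.1)

# ||A|| ||A^{-1}|| in the 2-norm
kappa = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

lhs = np.linalg.norm(delta_x) / np.linalg.norm(x)          # rel. error in x
rhs = kappa * np.linalg.norm(delta_b) / np.linalg.norm(b)  # bound

print(f"rel. error in x        : {lhs:.6f}")
print(f"kappa * rel. error in b: {rhs:.6f}")
```

For this particular $\delta b$ the two sides nearly coincide, which illustrates that the bound cannot be improved in general.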
The Condition Number of a Matrix
Thus, the quantity $\|A\|\,\|A^{-1}\|$ bounds the relative change in the solution for a given relative change in the problem: it measures the relative condition of the system of linear equations problem.
Definition
Given a matrix norm $\|\cdot\|$, the condition number of matrix $A$ is
\[ \mathrm{cond}_{\mathrm{rel}}(A) = \|A\|\,\|A^{-1}\|. \]
This depends on the norm used; but, since all norms on $\mathbb{R}^n$ are equivalent (any two differ by at most fixed multiplicative constants for a given $n$), all measures of condition number are equally good.
We can interpret $\mathrm{cond}_{\mathrm{rel}}(A)$ as:
the amount a relative error in $b$ is magnified in the solution vector $x$; or
the distortion $A$ produces when applied to the unit sphere; or
how “close” $A$ (and indeed $A^{-1}$) is to being a singular matrix.
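To see how the choice of norm affects the value, the sketch below evaluates the definition for the $2 \times 2$ example matrix in four norms and compares with NumPy's `numpy.linalg.cond`. It assumes NumPy; the dictionary `condition_numbers` is an illustrative name.

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])
A_inv = np.linalg.inv(A)

condition_numbers = {}
for p in (1, 2, np.inf, "fro"):
    by_definition = np.linalg.norm(A, p) * np.linalg.norm(A_inv, p)
    by_library = np.linalg.cond(A, p)
    condition_numbers[p] = (by_definition, by_library)
    print(f"p = {p!s:>3}: {by_definition:.2f} (definition), {by_library:.2f} (np.linalg.cond)")
```

The values differ between norms (roughly 39,601 in the 1-norm and 39,206 in the 2-norm here) but have the same order of magnitude, in line with the equivalence of norms.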
The Spectral Condition Number of a Matrix
Definition
We also define the spectral condition number of $A$ as
\[ \mathrm{cond}^*_{\mathrm{rel}}(A) := \frac{\max_{\lambda \in \sigma(A)} |\lambda|}{\min_{\lambda \in \sigma(A)} |\lambda|}. \]
Here $\sigma(A)$, the spectrum of $A$, is the set of all eigenvalues of $A$.
If $\lambda$ is an eigenvalue of $A$, its modulus or length $|\lambda|$ is the factor by which a $\lambda$-eigenvector is expanded (if $|\lambda| > 1$) or contracted (if $|\lambda| < 1$). Thus $\rho(A) = \max_{\lambda \in \sigma(A)} |\lambda|$, the spectral radius, is the largest factor by which $A$ multiplies an eigenvector, while $\min_{\lambda \in \sigma(A)} |\lambda|$ is the smallest factor by which $A$ multiplies an eigenvector.
The ratio $\mathrm{cond}^*_{\mathrm{rel}}(A)$ is thus a measure of the distortion produced by $A$: how great is the difference in expansion/contraction of eigenvectors that $A$ can cause.
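The definition translates directly into code. This is a minimal sketch, assuming NumPy; the function name `spectral_cond` and the two test matrices are illustrative assumptions.

```python
import numpy as np

def spectral_cond(A):
    """Ratio of the largest to the smallest eigenvalue modulus of A."""
    moduli = np.abs(np.linalg.eigvals(A))
    return moduli.max() / moduli.min()

# Diagonal matrix: the eigenvectors e1, e2 are scaled by 4 and by 0.5.
D = np.diag([4.0, 0.5])
spectral_cond_D = spectral_cond(D)   # 4 / 0.5

# Rotation by 90 degrees: eigenvalues +i and -i, both of modulus 1.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
spectral_cond_R = spectral_cond(R)

print(spectral_cond_D, spectral_cond_R)
```

The rotation shows why the modulus is needed: its eigenvalues are complex, yet no eigenvector is expanded or contracted, so the ratio is 1.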
An Example
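In the example above, the related matrix
\[ A = \begin{pmatrix} 1.00 & 0.99 \\ 0.99 & 0.98 \end{pmatrix} \]
has eigenvalues $\lambda_1 \approx 1.98$ and $\lambda_2 \approx -0.00005$ (more precisely, $1.98005$ and $-0.0000505$), the roots of the characteristic equation $\det(A - \lambda I) = 0$, where
\[
\begin{aligned}
\det(A - \lambda I) &= \det\begin{pmatrix} 1.00 - \lambda & 0.99 \\ 0.99 & 0.98 - \lambda \end{pmatrix} = (1 - \lambda)(0.98 - \lambda) - 0.99^2 \\
&= \lambda^2 - 1.98\lambda + 0.98 - 0.9801 = \lambda^2 - 1.98\lambda - 0.0001.
\end{aligned}
\]
Thus the spectral condition number is $\mathrm{cond}^*_{\mathrm{rel}}(A) \approx |1.98| / |-0.00005| = 39{,}600$ with the rounded eigenvalues, or about $39{,}206$ with the unrounded ones.
Hence, this matrix is very ill-conditioned.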
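The eigenvalues can be cross-checked in two ways: as roots of the characteristic polynomial and with a symmetric eigenvalue solver. This is a minimal sketch, assuming NumPy; the variable names are illustrative.

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])

# Roots of the characteristic polynomial  λ^2 - 1.98 λ - 0.0001
char_roots = np.sort(np.real(np.roots([1.0, -1.98, -0.0001])))

# A is symmetric, so eigvalsh applies; it returns eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(A)

spectral_cond = np.abs(eigenvalues).max() / np.abs(eigenvalues).min()

print("roots of characteristic polynomial:", char_roots)
print("eigenvalues of A                  :", eigenvalues)
print(f"spectral condition number         : {spectral_cond:.1f}")
```

The small eigenvalue is about $-5.05 \times 10^{-5}$, so rounding it to $-0.00005$ changes the ratio by about 1%.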
Properties of the Condition Number
The condition number $\mathrm{cond}_{\mathrm{rel}}(A)$ is bounded below by 1: this is seen by noting that $\|I\| = 1$ for any induced (operator) matrix norm and
\[ 1 = \|I\| = \|AA^{-1}\| \le \|A\|\,\|A^{-1}\| = \mathrm{cond}_{\mathrm{rel}}(A). \]
Fact
Each norm-based condition number is also bounded below by the spectral condition number of $A$:
\[ 1 \le \mathrm{cond}^*_{\mathrm{rel}}(A) \le \mathrm{cond}_{\mathrm{rel}}(A) \]
for any induced matrix norm.
Thus the spectral condition number is the smallest measure of relative condition of the system of linear equations problem.
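For symmetric matrices the spectral and 2-norm condition numbers coincide, but in general the gap can be large. The sketch below, assuming NumPy, uses a hypothetical non-symmetric matrix `T` (chosen for illustration, not taken from the lecture) whose eigenvalues both equal 1.

```python
import numpy as np

def spectral_cond(M):
    """Ratio of the largest to the smallest eigenvalue modulus of M."""
    moduli = np.abs(np.linalg.eigvals(M))
    return moduli.max() / moduli.min()

# Upper-triangular matrix: its eigenvalues are the diagonal entries 1 and 1.
T = np.array([[1.0, 100.0],
              [0.0,   1.0]])

spectral = spectral_cond(T)
norm_based = {p: np.linalg.cond(T, p) for p in (1, 2, np.inf)}

print("spectral condition number:", spectral)
for p, value in norm_based.items():
    print(f"cond in the {p}-norm: {value:.1f}")
```

Here the spectral condition number is 1 while every norm-based one is about $10^4$, so a small spectral condition number alone does not guarantee a well-conditioned system.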
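Perturbation of A
If $A$ is perturbed by $\delta A$ then we have
\[
\begin{aligned}
b &= (A + \delta A)(x + \delta x) \\
  &= Ax + A\,\delta x + \delta A\,x + \delta A\,\delta x \\
\Rightarrow\; A\,\delta x &= -\delta A\,(x + \delta x) \qquad \text{(since } Ax = b\text{)} \\
\Rightarrow\; \delta x &= -A^{-1}\,\delta A\,(x + \delta x).
\end{aligned}
\]
Taking norms and using the triangle inequality we have
\[
\begin{aligned}
\|\delta x\| = \|A^{-1}\,\delta A\,(x + \delta x)\| &\le \|A^{-1}\|\,\|\delta A\|\,(\|x\| + \|\delta x\|) \\
\Rightarrow\; \|\delta x\|\,\bigl(1 - \|A^{-1}\|\,\|\delta A\|\bigr) &\le \|A^{-1}\|\,\|\delta A\|\,\|x\|.
\end{aligned}
\]
Thus, provided $\|A^{-1}\|\,\|\delta A\| < 1$,
\[
\frac{\|\delta x\|}{\|x\|} \le \frac{\|A^{-1}\|\,\|\delta A\|}{1 - \|A^{-1}\|\,\|\delta A\|}
= \frac{\|A\|\,\|A^{-1}\|}{1 - \|A^{-1}\|\,\|\delta A\|}\,\frac{\|\delta A\|}{\|A\|}.
\]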
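The final bound can be checked numerically on the earlier $2 \times 2$ example. This is a minimal sketch, assuming NumPy and the 2-norm; the perturbation `delta_A` is a hypothetical choice, small enough that $\|A^{-1}\|\,\|\delta A\| < 1$.

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])
b = np.array([1.99, 1.97])
delta_A = 1e-6 * np.array([[1.0, -1.0],
                           [1.0,  1.0]])   # hypothetical perturbation of A

x = np.linalg.solve(A, b)
x_perturbed = np.linalg.solve(A + delta_A, b)
delta_x = x_perturbed - x

norm_A = np.linalg.norm(A, 2)
norm_A_inv = np.linalg.norm(np.linalg.inv(A), 2)
norm_dA = np.linalg.norm(delta_A, 2)

theta = norm_A_inv * norm_dA   # the derivation needs theta < 1
lhs = np.linalg.norm(delta_x) / np.linalg.norm(x)
rhs = (norm_A * norm_A_inv) / (1.0 - theta) * (norm_dA / norm_A)

print(f"||A^-1|| ||dA|| = {theta:.4f}")
print(f"rel. error in x = {lhs:.4f}")
print(f"bound           = {rhs:.4f}")
```

A relative change in $A$ of order $10^{-6}$ moves the solution by about 2%, and the computed error stays below the bound.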
The Condition Number Again...
Since $1 = \|I\| = \|AA^{-1}\| \le \|A\|\,\|A^{-1}\|$ we have $\dfrac{1}{\|A\|} \le \|A^{-1}\|$.
Thus, if $\|A^{-1}\|\,\|\delta A\| \ll 1$ (and so $\|\delta A\| / \|A\| \ll 1$), then $1 - \|A^{-1}\|\,\|\delta A\| \approx 1$ and we have
\[ \frac{\|\delta x\|}{\|x\|} \lesssim \mathrm{cond}_{\mathrm{rel}}(A)\,\frac{\|\delta A\|}{\|A\|}. \]
Thus, for a small perturbation of $A$, we again have that the condition number measures the relative condition of the system of linear equations problem.
A similar result can be derived for the case where both $A$ and $b$ are perturbed.
“How Close to Singular?”
Theorem
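If $A$ is non-singular, and
\[ \frac{\|\delta A\|}{\|A\|} < \frac{1}{\mathrm{cond}_{\mathrm{rel}}(A)} \]
then $A + \delta A$ is also non-singular.
This theorem tells us that the condition number measures the distance from $A$ to the nearest singular matrix: every singular matrix lies at a relative distance of at least $1/\mathrm{cond}_{\mathrm{rel}}(A)$ from $A$. It is a better measure than the determinant of “how close to singularity” a matrix is.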
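In the 2-norm the bound in the theorem is attained: by a standard result on low-rank approximation, the nearest singular matrix is obtained by removing the smallest singular value, at relative distance exactly $1/\mathrm{cond}_{\mathrm{rel}}(A)$. The sketch below assumes NumPy and the 2-norm; the variable names and the scaled identity `S` are illustrative. It also contrasts the condition number with the determinant.

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])
U, sigma, Vt = np.linalg.svd(A)   # singular values in descending order

# Smallest 2-norm perturbation that makes A singular:
# cancel the term belonging to the smallest singular value.
delta_A = -sigma[-1] * np.outer(U[:, -1], Vt[-1, :])
A_singular = A + delta_A

relative_distance = np.linalg.norm(delta_A, 2) / np.linalg.norm(A, 2)
kappa_2 = sigma[0] / sigma[-1]
smallest_sv_after = np.linalg.svd(A_singular, compute_uv=False)[-1]

# The determinant is a poor indicator: 0.1 * I is perfectly conditioned,
# yet in dimension 20 its determinant is about 1e-20.
S = 0.1 * np.eye(20)
det_S = np.linalg.det(S)
cond_S = np.linalg.cond(S)

print(f"relative distance to singularity: {relative_distance:.3e}")
print(f"1 / cond_2(A)                   : {1.0 / kappa_2:.3e}")
print(f"det(0.1 I) = {det_S:.1e}, cond(0.1 I) = {cond_S:.1f}")
```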
Errors and Residuals
There are two common ways to measure the discrepancy between the true solution $x$ and the computed solution $\hat{x}$:
Error $\delta x = x - \hat{x}$
Residual $r = b - A\hat{x}$
If $A$ is invertible, and either $\delta x$ or $r$ is zero, then both must be zero.
In many applications, we want to solve $Ax = b$ so that $r$, the difference between the LHS and RHS, is small, i.e., so that $\|r\| = \|b - A\hat{x}\|$ is small.
Intuitively, we can think of the residual as follows: if you have a computed solution $\hat{x}$ to a system of linear equations and you know the exact solution $x$, then you know the error $\delta x = x - \hat{x}$; but if you don't know the solution $x$ beforehand, then the residual $r = b - A\hat{x}$ is a computable measure, along a different axis, of how close you are.
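Cond(A), r and Errors in x
Let $\hat{x}$ be the computed solution to $Ax = b$. Then $\delta x = x - \hat{x}$ and $r = b - A\hat{x}$, giving
\[ A\,\delta x = Ax - A\hat{x} = b - A\hat{x} = r \quad\Rightarrow\quad \delta x = A^{-1} r. \]
Thus, by the submultiplicative property of matrix norms,
\[ \|\delta x\| \le \|A^{-1}\|\,\|r\|. \qquad \text{[1]} \]
Similarly, $\|b\| = \|Ax\| \le \|A\|\,\|x\|$, so
\[ \frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|}. \qquad \text{[2]} \]
Combining [1] and [2], it follows that
\[ \frac{\|\delta x\|}{\|x\|} \le \|A\|\,\|A^{-1}\|\,\frac{\|r\|}{\|b\|}. \]
Thus
\[ \frac{\|\delta x\|}{\|x\|} \le \mathrm{cond}_{\mathrm{rel}}(A)\,\frac{\|r\|}{\|b\|}. \]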
The Effect of Ill-conditioned A
The conclusion is: if $A$ is ill-conditioned, then a small $\|r\|$ does not imply a small $\|\delta x\|/\|x\|$.
(We will see a similar conclusion for solutions of non-linear equations: if the problem is ill-conditioned, then a small "residual" $|f(x_k)|$ does not mean that $|x_k - x_{k-1}|$ is small.)
Solutions with small residuals can therefore still have large errors in $x$ if $A$ is ill-conditioned.
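To make this concrete, the following minimal sketch compares the relative residual, the relative error and the condition number for a small ill-conditioned system. The matrix, the right-hand side and the deliberately poor candidate solution `x_hat` are assumptions chosen for illustration, not taken from the lecture, and the infinity norm is used only because it is easy to compute in plain Python.

```python
# Pure-Python sketch (no NumPy); the infinity norm is used throughout.

def vec_norm(v):
    return max(abs(t) for t in v)

def mat_norm(M):
    # Induced infinity norm: maximum absolute row sum.
    return max(sum(abs(t) for t in row) for row in M)

def matvec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]

# Hypothetical ill-conditioned system with exact solution x = (1, 1).
A = [[1.0, 1.0],
     [1.0, 1.0001]]
b = [2.0, 2.0001]
x_exact = [1.0, 1.0]
x_hat = [2.0, 0.0]  # a poor "computed" solution

# Inverse of a 2x2 matrix via the adjugate formula.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]

r = [bi - axi for bi, axi in zip(b, matvec(A, x_hat))]   # residual b - A x_hat
dx = [xe - xh for xe, xh in zip(x_exact, x_hat)]         # error x - x_hat

rel_residual = vec_norm(r) / vec_norm(b)
rel_error = vec_norm(dx) / vec_norm(x_exact)
cond = mat_norm(A) * mat_norm(A_inv)

print("relative residual:", rel_residual)
print("relative error:   ", rel_error)
print("condition number: ", cond)
```

In this sketch the relative residual is of order $10^{-5}$ while the relative error is 1; the bound permits this because the condition number is of order $10^{4}$.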
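Special Systems
Let us consider the following types of matrices:
Symmetric: $A = A^T$.
Positive definite: $x^T A x > 0$ for all $x \neq 0$. In particular, if $x$ is an eigenvector with eigenvalue $\lambda$, then $x^T A x = \lambda x^T x > 0$, so all eigenvalues satisfy $\lambda > 0$.
Diagonally dominant (DD): in each row, the magnitude of the diagonal element is greater than or equal to the sum of the magnitudes of the other elements in the row, i.e., $|a_{ii}| \ge \sum_{j \neq i} |a_{ij}|$ for all $i$.
Strictly DD: the same, except with strict inequality, i.e., $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$ for all $i$.
Upper triangular: all entries below the diagonal are zero; here with $a_{ii} \neq 0$:
\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & \cdots & a_{1n} \\
0 & a_{22} & & & \vdots \\
\vdots & 0 & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & \vdots \\
0 & \cdots & \cdots & 0 & a_{nn}
\end{pmatrix}
\]
Recall that if a matrix is in echelon form (e.g., upper triangular), the first non-zero entry in a row is called the pivot for that row: here $a_{kk}$ is the pivot for the $k$th row.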
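The definitions above translate directly into simple checks. The sketch below is one possible way to write them for matrices stored as lists of rows; the function names and the two example matrices are hypothetical and chosen only for illustration.

```python
def is_symmetric(A, tol=1e-12):
    # A = A^T, compared entry by entry below the diagonal.
    n = len(A)
    return all(abs(A[i][j] - A[j][i]) <= tol
               for i in range(n) for j in range(i))

def is_diagonally_dominant(A, strict=False):
    # |a_ii| >= sum_{j != i} |a_ij| in every row (">" if strict).
    for i, row in enumerate(A):
        off_diagonal = sum(abs(v) for j, v in enumerate(row) if j != i)
        if strict:
            if not abs(row[i]) > off_diagonal:
                return False
        elif not abs(row[i]) >= off_diagonal:
            return False
    return True

def is_upper_triangular(A, tol=0.0):
    # All entries strictly below the diagonal vanish.
    return all(abs(A[i][j]) <= tol
               for i in range(len(A)) for j in range(i))

# Hypothetical examples.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]   # symmetric and strictly diagonally dominant
B = [[2.0, 1.0, 1.0],
     [0.0, 3.0, 3.0],
     [0.0, 0.0, 1.0]]   # upper triangular, DD but not strictly DD

print(is_symmetric(A), is_diagonally_dominant(A, strict=True))
print(is_upper_triangular(B), is_diagonally_dominant(B),
      is_diagonally_dominant(B, strict=True))
```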
An Overview of the Algorithms
Direct methods for solving $Ax = b$ apply elementary matrix operations to $A$ and $b$, giving a transformed problem $A'x' = b'$ which is easily solved for $x'$. Within direct methods:
In Gauss-Jordan, multiples of a pivot row are subtracted from other rows, such that one obtains an upper triangular matrix first, and an identity matrix next. Gauss-Jordan works (with appropriate pivoting) on any non-singular matrix, but without pivoting it is guaranteed to be stable only for diagonally dominant or positive-definite matrices.
Gauss-Jordan is also closely related to the LU and LUP decompositions, where U stands for an upper triangular matrix, L for a lower triangular matrix, and P for a permutation matrix.
On symmetric positive-definite matrices, one can also use other decomposition methods, e.g., Cholesky, which is stable and about twice as fast as LU; QR is another stable decomposition, although not a faster one.
An Overview of the Algorithms
Iterative methods successively improve an initial guess until it becomes satisfactory.
Iterative methods for systems of linear equations are best understood as means of solving an associated optimisation problem.
Consider the quadratic $f(x) := \frac{1}{2} x^T A x - b^T x + c$ with $A$ symmetric positive definite. Its gradient is $\nabla f(x) = Ax - b$, so whenever the first-order optimality conditions of $\min_{x \in \mathbb{R}^n} f(x)$ are satisfied, we have $Ax = b$.
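The optimisation view can be sketched with steepest descent. The sketch assumes the sign convention $f(x) = \frac{1}{2} x^T A x - b^T x + c$ with $A$ symmetric positive definite, for which the negative gradient at $x$ is the residual $r = b - Ax$. The function name, the tolerance and the example system are hypothetical, and steepest descent is used here only as the simplest illustration, not as a method recommended by the lecture.

```python
def matvec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10000):
    """Minimise f(x) = 1/2 x^T A x - b^T x for symmetric positive definite A."""
    x = list(x0)
    for _ in range(max_iter):
        # r = b - A x is the negative gradient of f at x.
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        rr = dot(r, r)
        if rr ** 0.5 <= tol:
            break
        # Exact line search along r for a quadratic objective.
        alpha = rr / dot(r, matvec(A, r))
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
    return x

# Hypothetical 2x2 example; the exact solution is (1/11, 7/11).
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
x = steepest_descent(A, b, [0.0, 0.0])
print(x)
```

At the returned point the gradient is (numerically) zero, which is the same statement as $Ax = b$.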
An Overview of the Algorithms
Within iterative methods:
Jacobi method is guaranteed to converge if A is strictly diagonally dominant.
Gauss-Seidel is guaranteed to converge if $A$ is either strictly diagonally dominant or symmetric positive definite.
Many other algorithms exist: CG works on symmetric positive-definite matrices, while GMRES also handles general non-singular matrices.
In a number of applications, iterative methods are preferred to direct methods, especially when the coefficient matrix $A$ is sparse or structured.
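A minimal sketch of the two classical iterations may help to fix ideas. It assumes a strictly diagonally dominant matrix stored as a list of rows and a fixed number of sweeps; the function names, the example system and the iteration count are hypothetical choices for illustration.

```python
def jacobi(A, b, x0, iters):
    # Every component is updated from the previous iterate only.
    n = len(b)
    x = list(x0)
    for _ in range(iters):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

def gauss_seidel(A, b, x0, iters):
    # Components are updated in place, so new values are used immediately.
    n = len(b)
    x = list(x0)
    for _ in range(iters):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# Hypothetical strictly diagonally dominant system with solution (1, 1, 1).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [5.0, 5.0, 3.0]

x_jacobi = jacobi(A, b, [0.0, 0.0, 0.0], 100)
x_gs = gauss_seidel(A, b, [0.0, 0.0, 0.0], 100)
print(x_jacobi)
print(x_gs)
```

Each sweep touches only the non-zero entries of a row, which is one reason why such methods suit sparse matrices.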
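Gauss-Jordan
Recall that this method uses a sequence of elementary matrix operations to transform the square system $Ax = b$ into an upper triangular system $Ux = b'$, which is then solved using back substitution.
We use a superscript in parentheses to denote the stage: $x_i^{(k)}$ denotes the value for $x_i$ at the $k$th stage and $A^{(k)}$ denotes the matrix $A$ at this stage.
At stage $k$ we have:
\[
\left(\begin{array}{cccccc|c}
a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1k}^{(1)} & \cdots & a_{1n}^{(1)} & b_1^{(1)} \\
0 & a_{22}^{(2)} & \cdots & a_{2k}^{(2)} & \cdots & a_{2n}^{(2)} & b_2^{(2)} \\
\vdots & & \ddots & \vdots & & \vdots & \vdots \\
0 & \cdots & \cdots & a_{kk}^{(k)} & \cdots & a_{kn}^{(k)} & b_k^{(k)} \\
\vdots & & & \vdots & & \vdots & \vdots \\
0 & \cdots & \cdots & a_{nk}^{(k)} & \cdots & a_{nn}^{(k)} & b_n^{(k)}
\end{array}\right)
= \left( A^{(k)} \mid b^{(k)} \right)
\]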
What Gauss-Jordan does at Stage k
The elements $a_{k+1,k}^{(k)}, a_{k+2,k}^{(k)}, \dots, a_{nk}^{(k)}$ are eliminated by subtracting the following multiples of row $k$ from rows $k+1, k+2, \dots, n$:
\[ m_{k+1,k} := \frac{a_{k+1,k}^{(k)}}{a_{kk}^{(k)}}, \quad m_{k+2,k} := \frac{a_{k+2,k}^{(k)}}{a_{kk}^{(k)}}, \quad \dots, \quad m_{n,k} := \frac{a_{n,k}^{(k)}}{a_{kk}^{(k)}}. \]
In general, assuming that $a_{kk}^{(k)} \neq 0$, the $(i, k)$ multiplier is
\[ m_{ik} := \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}, \quad i = k+1, \dots, n, \]
and, for all $i, j = k+1, \dots, n$,
\[ a_{ij}^{(k+1)} = a_{ij}^{(k)} - m_{ik}\, a_{kj}^{(k)}, \qquad b_i^{(k+1)} = b_i^{(k)} - m_{ik}\, b_k^{(k)}. \]
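The update formulas for a single stage can be sketched as follows. The function name `eliminate_stage` and the example system are hypothetical; indices are 0-based in the code, so stage `k` in the code corresponds to stage $k + 1$ in the formulas, and no pivoting is attempted.

```python
def eliminate_stage(A, b, k):
    """Apply one elimination stage (0-based k) in place; return the multipliers."""
    n = len(A)
    if A[k][k] == 0:
        raise ZeroDivisionError("zero pivot; a row swap would be needed")
    multipliers = {}
    for i in range(k + 1, n):
        m_ik = A[i][k] / A[k][k]
        multipliers[i] = m_ik
        for j in range(k + 1, n):
            A[i][j] -= m_ik * A[k][j]
        A[i][k] = 0.0   # known to be zero; set explicitly for readability
        b[i] -= m_ik * b[k]
    return multipliers

# Hypothetical 3x3 example.
A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
b = [4.0, 10.0, 24.0]

m0 = eliminate_stage(A, b, 0)   # multipliers for the first column
m1 = eliminate_stage(A, b, 1)   # multipliers for the second column
print(m0, m1)
print(A, b)
```

After the two stages the matrix is upper triangular and the rows above the current pivot row are never touched again.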
A Picture of the Matrix at Stage k
Note that rows $1, \dots, k$ will not change from stage $k + 1$ onwards.
[Figure: Gauss-Jordan, changes at stage k. The entries below the pivot $a_{kk}$ are reduced to zero; the rows and columns after the $k$th form the part of the matrix that changes.]
Gauss-Jordan

    from numpy import dot

    def noPivot(A, row):
        # Assumed definition of the default rule: keep the diagonal entry as pivot.
        return row

    def GaussJordan(A, b, pivoting=noPivot):
        # A: square float array, b: float vector; both are overwritten.
        (rows, cols) = A.shape
        for row in range(0, rows-1):
            pivot = pivoting(A, row)  # index of the row supplying the pivot
            if abs(A[pivot, row]) < 1e-8: raise ValueError()
            if pivot != row:  # swap rows of A and the matching entries of b
                A[[row, pivot], :] = A[[pivot, row], :]
                b[[row, pivot]] = b[[pivot, row]]
            for i in range(row+1, rows):
                if abs(A[row, row]) < 1e-8: raise ValueError()
                factor = A[i, row] / A[row, row]  # multiplier m_ik
                # Row update; the analysis refers to this statement as "Line 12".
                A[i, row+1:rows] = (A[i, row+1:rows]
                                    - factor*A[row, row+1:rows])
                b[i] = b[i] - factor*b[row]
        # Back-substitution: solve the upper triangular system, overwriting b with x.
        for k in range(rows-1, -1, -1):
            b[k] = (b[k] - dot(A[k, k+1:rows], b[k+1:rows])) / A[k, k]
        return b
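Analysis of Gauss-Jordan
Observe that the row update in the inner loop (Line 12 of the listing) performs $O(n)$ "multiply–accumulate" operations, and that it is executed for $O(n)$ rows at each of the $n - 1$ stages. If we count one multiply–accumulate as 1 operation, the number $S(n)$ of operations performed is:
\begin{align*}
S(n) &= \sum_{k=1}^{n-1} \sum_{i=k+1}^{n} \sum_{j=k+1}^{n} 1 \\
     &= \sum_{k=1}^{n-1} \sum_{i=k+1}^{n} (n - k) \\
     &= \sum_{k=1}^{n-1} (n - k)^2 \\
     &= (n-1)^2 + (n-2)^2 + \cdots + 2^2 + 1^2 \\
     &= n(n-1)(2n-1)/6 \approx n^3/3 \quad \text{for large } n.
\end{align*}
Hence Gauss-Jordan is a $\Theta(n^3)$ process. (The identity $\sum_{k=1}^{n-1} k^2 = \frac{1}{6} n(n-1)(2n-1)$ follows by induction.)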
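The closed form can be checked numerically by counting the iterations of the triple loop directly. The sketch below uses the 1-based index ranges of the sums; the function names are hypothetical.

```python
def count_operations(n):
    # Count one multiply-accumulate per (k, i, j) triple of the elimination.
    count = 0
    for k in range(1, n):                 # stages k = 1, ..., n-1
        for i in range(k + 1, n + 1):     # rows below the pivot
            for j in range(k + 1, n + 1): # columns to the right of the pivot
                count += 1
    return count

def closed_form(n):
    return n * (n - 1) * (2 * n - 1) // 6

for n in (2, 5, 10, 100):
    print(n, count_operations(n), closed_form(n), n ** 3 / 3)
```

For $n = 100$ the exact count is already within about 2% of $n^3/3$.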
A Perspective on Gauss-Jordan
To put $\Theta(n^3)$ into perspective, consider a single computer that can sustain $10^{11}$ operations per second ("100 gigaFLOPS").
For a $10\,000 \times 10\,000$ matrix, you need about $10^{12}$ operations, or 10 seconds.
For a $100\,000 \times 100\,000$ matrix, you need about $10^{15}$ operations, or just under 3 hours, if you can store the 80 GB in RAM.
For a $1\,000\,000 \times 1\,000\,000$ matrix, you need about $10^{18}$ operations, or over 115 days, if you can store the 8 TB in RAM.
As you can test using your own laptop, this is a very optimistic estimate.
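The arithmetic behind these figures can be reproduced in a few lines. The sketch assumes, as the estimates above do, roughly $n^3$ operations at a sustained rate of $10^{11}$ operations per second, and 8 bytes per matrix entry (double precision); the names are hypothetical.

```python
FLOPS = 1e11          # sustained operations per second (assumption from the text)
BYTES_PER_ENTRY = 8   # double precision

def estimate(n):
    operations = float(n) ** 3              # the text uses n^3 rather than n^3/3
    seconds = operations / FLOPS
    memory_bytes = BYTES_PER_ENTRY * n * n  # dense storage of the matrix
    return seconds, memory_bytes

for n in (10 ** 4, 10 ** 5, 10 ** 6):
    seconds, memory_bytes = estimate(n)
    print(n, "seconds:", seconds, "hours:", seconds / 3600,
          "days:", seconds / 86400, "GB:", memory_bytes / 1e9)
```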
The Net Effect. . .
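Gauss-Jordan transforms the original system $Ax = b$ to upper triangular form:
\[
Ux =
\begin{pmatrix}
a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\
0 & a_{22}^{(2)} & \cdots & a_{2n}^{(2)} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}^{(n)}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1^{(1)} \\ b_2^{(2)} \\ \vdots \\ b_n^{(n)} \end{pmatrix}
\]
This system of equations can now be solved using back substitution.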
Observations on Gauss-Jordan
The method assumes $a_{kk}^{(k)} \neq 0$; but in fact, since $A$ is invertible, we could always swap row $k$ with a later row to get $a_{kk}^{(k)} \neq 0$ (see later).
$A$ and $b$ are overwritten.
The 0's beneath the pivot element are not calculated. They are ignored, as they are known to be zero.
Thus the storage space for these zeros could be used for something else...
An extra matrix is not needed to store the $m_{ik}$'s. They can be stored in place of the zeros.
The operations on $b$ can be done separately, once we have stored the $m_{ik}$'s.
Because of the last observation, we may now solve for any $b$ without going through the elimination calculations again.
Gauss-Jordan with Varying b′
We solved $Ax = b$ using Gauss-Jordan, which required elementary row operations to be performed on both $A$ and $b$.
If we are then required to solve the equation $Ax = b'$, we would need to perform exactly the same operations, because these are determined by the elements of $A$ only, and $A$ is the same in both equations.
Hence, if we have stored the multipliers $m_{ik}$, we need to perform only the update of the right-hand side from the elimination loop, i.e.,
\[ b'_i := b'_i - m_{ik}\, b'_k, \quad k = 1, \dots, n-1, \quad i = k+1, \dots, n, \]
followed by back substitution.
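A minimal sketch of this reuse: the elimination is carried out once, each multiplier is stored in the cell it zeroes, and every further right-hand side then costs only $O(n^2)$ operations. The function names and the example data are hypothetical, and no pivoting is performed, so all pivots are assumed to be non-zero.

```python
def factor_in_place(A):
    """Eliminate without pivoting; store each multiplier m_ik in the cell it zeroes."""
    n = len(A)
    for k in range(n - 1):
        for i in range(k + 1, n):
            m_ik = A[i][k] / A[k][k]
            for j in range(k + 1, n):
                A[i][j] -= m_ik * A[k][j]
            A[i][k] = m_ik
    return A

def solve_factored(F, b):
    """Solve for a new right-hand side using the multipliers stored in F."""
    n = len(F)
    y = list(b)
    for k in range(n - 1):                  # b_i := b_i - m_ik b_k
        for i in range(k + 1, n):
            y[i] -= F[i][k] * y[k]
    for k in range(n - 1, -1, -1):          # back substitution on the upper triangle
        s = sum(F[k][j] * y[j] for j in range(k + 1, n))
        y[k] = (y[k] - s) / F[k][k]
    return y

# Hypothetical example: one factorisation, two right-hand sides.
F = factor_in_place([[2.0, 1.0, 1.0],
                     [4.0, 3.0, 3.0],
                     [8.0, 7.0, 9.0]])
x1 = solve_factored(F, [4.0, 10.0, 24.0])   # right-hand side A (1, 1, 1)^T
x2 = solve_factored(F, [7.0, 19.0, 49.0])   # right-hand side A (1, 2, 3)^T
print(x1, x2)
```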
The LU Decomposition of A
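If at each stage $k$ of Gauss-Jordan we store $m_{ik}$ in those cells of $A$ that become zero, then the matrix $A$ after elimination is as follows:
\[
\begin{pmatrix}
a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\
m_{21} & a_{22}^{(2)} & \cdots & a_{2n}^{(2)} \\
\vdots & & \ddots & \vdots \\
m_{n1} & m_{n2} & \cdots & a_{nn}^{(n)}
\end{pmatrix}
\]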
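LU Decomposition of A
We define the upper and unit lower triangular parts as
\[
U = (u_{ij}) =
\begin{pmatrix}
a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\
0 & a_{22}^{(2)} & \cdots & a_{2n}^{(2)} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}^{(n)}
\end{pmatrix},
\qquad
L = (\ell_{ij}) =
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
m_{21} & 1 & & \vdots \\
\vdots & & \ddots & \vdots \\
m_{n1} & m_{n2} & \cdots & 1
\end{pmatrix}.
\]
That is, for all $i, j \in \{1, \dots, n\}$,
\[
u_{ij} =
\begin{cases}
a_{ij}^{(i)} & \text{if } i \le j \\
0 & \text{otherwise}
\end{cases}
\qquad
\ell_{ij} =
\begin{cases}
m_{ij} & \text{if } i > j \\
1 & \text{if } i = j \\
0 & \text{otherwise.}
\end{cases}
\]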
An Unexpected Fact: A = LU
Theorem (LU Decomposition)
If $L = (\ell_{ij})$ and $U = (u_{ij})$ are the unit lower and upper triangular matrices generated by Gauss-Jordan, assuming $a_{kk}^{(k)} \neq 0$ at each stage, then
\[ A = (a_{ij}) = LU, \quad \text{that is,} \quad a_{ij} = \sum_{k=1}^{n} \ell_{ik}\, u_{kj}, \]
where
\[ u_{kj} = a_{kj}^{(k)}, \; k \le j, \quad \text{in particular } u_{kk} = a_{kk}^{(k)}, \]
and
\[ \ell_{ik} = m_{ik}, \; k < i, \quad \ell_{kk} = 1, \]
and this decomposition is unique.
For a proof, see (Watkins, 2004, 51–53).
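The theorem can be checked numerically on a small example. The sketch below builds $L$ from the multipliers and $U$ from the eliminated matrix, without pivoting, and then multiplies them back together; the function names and the example matrix are hypothetical.

```python
def lu_no_pivot(A):
    """Return (L, U) with L unit lower triangular, assuming non-zero pivots."""
    n = len(A)
    U = [row[:] for row in A]   # work on a copy so that A is preserved
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        if U[k][k] == 0:
            raise ZeroDivisionError("zero pivot; pivoting would be needed")
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]        # multiplier m_ik
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Hypothetical example.
A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
L, U = lu_no_pivot(A)
print(L)
print(U)
print(matmul(L, U))
```

For this matrix the product $LU$ reproduces $A$ exactly; in floating-point arithmetic one would in general expect agreement only up to rounding.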
A Reinterpretation of Gauss-Jordan
We can now interpret Gauss-Jordan as a process which decomposes $A$ into $L$ and $U$, and hence, writing $y := Ux$, we have
\[ Ax = LUx = L(Ux) = Ly = b. \]
This represents two triangular systems of equations,
\[ Ly = b \quad \text{and} \quad Ux = y, \]
whose solutions are:
\[ y = L^{-1} b, \quad Ux = L^{-1} b, \quad x = U^{-1} L^{-1} b. \]
Jakub Marecek and Sean McGarraghy (UCD) Numerical Analysis and Software October 8, 2015 37 / 1
Overall, we solve $Ly = b$ for $y$ first ("forward" substitution), and then solve $Ux = y$ for $x$ ("backward" substitution). The revised code is:

    import numpy as np
    from scipy.linalg import lu

    def LU(A, b):
        # scipy.linalg.lu returns P, L, U with A = P L U,
        # so we solve L U x = P^T b.
        P, L, U = lu(A)
        b = P.T @ np.asarray(b, dtype=float).flatten()
        n = b.size
        # Forward substitution: solve L y = b.
        y = np.zeros(n)
        for m in range(n):
            y[m] = b[m]
            for k in range(m):
                y[m] -= y[k] * L[m, k]
            y[m] /= L[m, m]
        # Backward substitution: solve U x = y.
        x = np.zeros(n)
        for m in range(n - 1, -1, -1):
            x[m] = y[m]
            for k in range(m + 1, n):
                x[m] -= x[k] * U[m, k]
            x[m] /= U[m, m]
        return x
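For comparison, a library routine can do the same two triangular solves. The following is a minimal sketch, assuming SciPy is installed; the matrix and right-hand sides are illustrative values, not from the lecture. It uses scipy.linalg.lu_factor and scipy.linalg.lu_solve to factor once and reuse the factorization for several right-hand sides.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Illustrative data (an assumption, not from the lecture).
A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

# Factor once; the result packs L, U and the row interchanges.
lu_and_piv = lu_factor(A)

# Each further right-hand side needs only two triangular solves.
b1 = np.array([4.0, 10.0, 24.0])
b2 = np.array([1.0, 0.0, 0.0])
x1 = lu_solve(lu_and_piv, b1)
x2 = lu_solve(lu_and_piv, b2)

print(x1)
print(np.allclose(A @ x2, b2))
```

The factorization costs $\Theta(n^3)$ operations, while each additional solve costs only $\Theta(n^2)$, which is why the decomposition is worth keeping.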
The LDU Decomposition of A
Gauss-Jordan also provides the decomposition
$$A = LDU',$$
where $L$ and $U'$ are unit lower and unit upper triangular, respectively, and $D = \mathrm{diag}(u_{ii})$ is the diagonal matrix with $u_{11}, \ldots, u_{nn}$ as its diagonal entries.
To see this, decompose $A = LU$ and let $U' = D^{-1}U$.
Since $U$ is non-singular, $u_{ii} \neq 0$ for $i = 1, 2, \ldots, n$, and hence $D^{-1}$ exists.
It is easy to show that $U' := D^{-1}U$ is a unit upper triangular matrix.
Thus, $A = LU = LDD^{-1}U = LDU'$.
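The construction of $D$ and $U'$ can be sketched in a few lines. This is a minimal illustration, assuming NumPy and a hand-picked pair of factors L and U (illustrative values, not from the lecture); the variable name U_prime stands for $U'$.

```python
import numpy as np

# Illustrative unit lower triangular L and upper triangular U (assumed values).
L = np.array([[1.0, 0.0, 0.0],
              [2.0, 1.0, 0.0],
              [4.0, 3.0, 1.0]])
U = np.array([[2.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
A = L @ U

d = np.diag(U)               # u_11, ..., u_nn (all non-zero here)
D = np.diag(d)               # D = diag(u_ii)
U_prime = U / d[:, None]     # divide row i of U by u_ii, i.e. D^{-1} U

print(U_prime)
print(np.allclose(L @ D @ U_prime, A))
```

Dividing row $i$ of $U$ by $u_{ii}$ is exactly the product $D^{-1}U$, so the diagonal of U_prime consists of ones.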
The LDU Decomposition for Special Kinds of A
If $A$ is symmetric, then
$$A = LDU' = LDL^t,$$
where $L$ is unit lower triangular.
If $A$ is symmetric and positive definite (that is, $x^t A x > 0$ for all $x \neq 0$), then each $u_{ii}$ is positive and
$$A = LDL^t = L\sqrt{D}\sqrt{D}L^t = CC^t,$$
where $C = L\sqrt{D}$ and $\sqrt{D} = \mathrm{diag}(\sqrt{u_{ii}})$.
This is called the Cholesky factorization of $A$.
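The relation $C = L\sqrt{D}$ can be checked against a library Cholesky routine. The following is a minimal sketch, assuming NumPy and an illustrative symmetric positive definite matrix (not from the lecture); it eliminates without pivoting, reads the pivots $u_{ii}$ off the diagonal of $U$, and compares with numpy.linalg.cholesky.

```python
import numpy as np

# Illustrative symmetric positive definite matrix (an assumption).
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])

# Elimination without pivoting: A = L U with unit lower triangular L.
n = A.shape[0]
L = np.eye(n)
U = A.copy()
for k in range(n - 1):
    for i in range(k + 1, n):
        L[i, k] = U[i, k] / U[k, k]
        U[i, k:] -= L[i, k] * U[k, k:]

d = np.diag(U)           # pivots u_ii, positive for a positive definite A
C = L * np.sqrt(d)       # scales column j of L by sqrt(u_jj), i.e. L sqrt(D)

print(C)
print(np.allclose(C @ C.T, A))
print(np.allclose(C, np.linalg.cholesky(A)))
```

In practice a dedicated Cholesky routine is preferable, since it exploits symmetry and does roughly half the work of a general LU factorization.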
Pivoting in Gauss-Jordan
In Gauss-Jordan we assumed that $a^{(k)}_{kk} \neq 0$ at each stage of the process.
If $a^{(k)}_{kk} = 0$, then we can interchange rows of the matrix $A^{(k)}$ so that $a^{(k)}_{kk} \neq 0$.
In fact, we need only find a row $i > k$ for which $a^{(k)}_{ik} \neq 0$ and then interchange rows $i$ and $k$.
It can be easily shown that if $A$ is non-singular then such a row exists.
Hence, theoretically, zero pivots cause no difficulty.
However, there is a much more important reason for interchanging rows: if $a^{(k)}_{kk}$ is small (even if $a^{(k)}_{kk} \neq 0$), then division by $a^{(k)}_{kk}$ would cause problems because of roundoff.
We can see this in the next example.
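The effect of a tiny pivot can be reproduced in double precision. The following is a minimal sketch with an assumed 2 x 2 system (the values and the helper name solve_2x2 are illustrative, not from the lecture): $\varepsilon x_1 + x_2 = 1$, $x_1 + x_2 = 2$ with $\varepsilon = 10^{-20}$, whose exact solution is very close to $x_1 = x_2 = 1$.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Eliminate using a11 as the pivot (no row interchange), then back-substitute."""
    m = a21 / a11                # multiplier
    u22 = a22 - m * a12
    c2 = b2 - m * b1
    x2 = c2 / u22
    x1 = (b1 - a12 * x2) / a11
    return x1, x2

eps = 1e-20

# Tiny pivot eps: the multiplier is 1e20 and swamps the other entries.
x_bad = solve_2x2(eps, 1.0, 1.0, 1.0, 1.0, 2.0)

# Same system with the two equations interchanged: the pivot is 1.
x_good = solve_2x2(1.0, 1.0, eps, 1.0, 2.0, 1.0)

print(x_bad)    # (0.0, 1.0): x1 is completely wrong
print(x_good)   # (1.0, 1.0)
```

With the tiny pivot, the entries 1 and 2 are lost when they are added to numbers of size $10^{20}$, so the computed $x_1$ carries no correct digits even though the pivot was not zero.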
More Roundoff Error
The problem with roundoff in Gauss-Jordan is that it propagates and is amplified from stage to stage, because there is no contraction of error.
Thus, roundoff error control is absolutely essential in Gauss-Jordan.
We indicate two approaches to this: partial pivoting and complete pivoting.
Note: It can be shown that the step $A^{(k)} \to A^{(k+1)}$ in Gauss-Jordan may be viewed as multiplication by a matrix $M^{(k)}$, where $M^{(k)}$ is a product of elementary matrices (matrices associated with elementary row operations).
It can be shown that if all the multipliers have magnitude at most 1, then the amplification of roundoff is limited and the final result is usually accurate, as in our second approach to the example.
This is the basic idea of partial pivoting.
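Partial pivoting can be sketched as follows. This is a minimal illustration, assuming NumPy; the function name solve_partial_pivoting and the test system are hypothetical. At stage $k$ it moves the entry of largest absolute value in column $k$ (rows $k, \ldots, n$) into the pivot position, which keeps every multiplier at magnitude at most 1.

```python
import numpy as np

def solve_partial_pivoting(A, b):
    """Solve A x = b by elimination with partial pivoting.

    Returns the solution and the largest multiplier magnitude encountered.
    """
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = b.size
    max_multiplier = 0.0
    for k in range(n - 1):
        # Row (at or below k) with the largest |entry| in column k.
        p = k + int(np.argmax(np.abs(A[k:, k])))
        if A[p, k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            A[[k, p]] = A[[p, k]]      # interchange rows k and p
            b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]      # |m| <= 1 by the choice of pivot
            max_multiplier = max(max_multiplier, abs(m))
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Backward substitution on the resulting upper triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x, max_multiplier

x, max_multiplier = solve_partial_pivoting([[1e-20, 1.0], [1.0, 1.0]], [1.0, 2.0])
print(x, max_multiplier)
```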
Scaled Partial Pivoting
Scaled partial pivoting is a variation of standard partial pivoting.
In scaled partial pivoting, at stage $k$, we choose as the pivot the entry in column $k$ which is of greatest absolute value relative to the entries in its row (as before, we only consider rows $k, \ldots, n$).
The scaled pivoting approach is useful when entries have large differences in absolute value, since this causes propagation of roundoff error.
We use it for systems of linear equations where the row entries vary greatly in magnitude, e.g.,
$$\begin{pmatrix} 10 & 10^5 & 10^6 \\ 1 & -1 & 3 \end{pmatrix}$$
Here, it is worth swapping the two rows, since the current pivot, 10, is larger than 1 but is very small relative to the other entries $10^5$ and $10^6$ in the first row.
Without a row swap, roundoff errors will lead to loss of accuracy.
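The pivot choice can be sketched as follows. This is a minimal illustration, assuming NumPy, assuming that the first two columns of the example above form the coefficient matrix, and assuming no row is entirely zero; the helper name scaled_pivot_row is hypothetical. It recomputes the row scales at each stage from the remaining submatrix, which is one of several common variants.

```python
import numpy as np

def scaled_pivot_row(A, k):
    """Row index i >= k maximising |a_ik| / max_j |a_ij| over columns j >= k."""
    A = np.asarray(A, dtype=float)
    scales = np.max(np.abs(A[k:, k:]), axis=1)   # largest entry of each row
    ratios = np.abs(A[k:, k]) / scales           # pivot candidates, scaled
    return k + int(np.argmax(ratios))

# Coefficient part of the example (an assumption about its layout).
A = np.array([[10.0, 1e5],
              [1.0, -1.0]])

plain = int(np.argmax(np.abs(A[:, 0])))   # standard partial pivoting
scaled = scaled_pivot_row(A, 0)           # scaled partial pivoting

print(plain)    # 0: keeps the first row, since 10 > 1
print(scaled)   # 1: picks the second row, since 1/1 > 10/1e5
```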
Complete Pivoting
Complete pivoting (also called maximal pivoting) is a natural extension of partial pivoting whereby we find $i^*$ and $j^*$ such that
$$|a_{i^*j^*}| = \max_{k \le i,j \le n} |a_{ij}|.$$
This means that we interchange rows $i^*$ and $k$ and columns $j^*$ and $k$.
The row interchange does not have any effect (theoretically) on the solution, but the column interchange interchanges the variable names (labels), i.e., $x_{j^*} \leftrightarrow x_k$.
These interchanges of columns must be recorded so that the correct variable is associated with the corresponding solution value at the end of the algorithm.
Complete pivoting is an $O(n^2)$ process at each stage $k$.
Thus it adds $O(n^3)$ steps to Gauss-Jordan, which is a substantial increase, although Gaussian elimination is still $\Theta(n^3)$.
It is rarely used because it has been found in practice that partial pivoting is adequate to ensure numerical stability, except in isolated cases: in these cases, complete pivoting may be needed to attain acceptable accuracy.
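The bookkeeping for the column interchanges can be sketched as follows. This is a minimal illustration, assuming NumPy; the function name solve_complete_pivoting, the array perm and the test system are hypothetical. The array perm records which original variable currently sits in each column, so that the solution can be put back in the original order at the end.

```python
import numpy as np

def solve_complete_pivoting(A, b):
    """Solve A x = b by elimination with complete pivoting."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = b.size
    perm = np.arange(n)      # perm[j] = original index of the variable in column j
    for k in range(n - 1):
        # Largest |entry| in the remaining submatrix: the O(n^2) search per stage.
        sub = np.abs(A[k:, k:])
        i_rel, j_rel = np.unravel_index(np.argmax(sub), sub.shape)
        i_star, j_star = k + i_rel, k + j_rel
        if A[i_star, j_star] == 0.0:
            raise ValueError("matrix is singular")
        A[[k, i_star]] = A[[i_star, k]]          # interchange rows
        b[[k, i_star]] = b[[i_star, k]]
        A[:, [k, j_star]] = A[:, [j_star, k]]    # interchange columns
        perm[[k, j_star]] = perm[[j_star, k]]    # record the relabelling
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Backward substitution for the permuted variables z.
    z = np.zeros(n)
    for i in range(n - 1, -1, -1):
        z[i] = (b[i] - A[i, i + 1:] @ z[i + 1:]) / A[i, i]
    x = np.zeros(n)
    x[perm] = z              # undo the column interchanges
    return x

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
b = A @ np.array([1.0, -2.0, 3.0])
x = solve_complete_pivoting(A, b)
print(x)
```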
Direct Methods: Conclusions
In theory, the complexity can be decreased to that of matrix-matrix multiplication.
Complete pivoting is safe (proven), but so computationally expensive that it is rarely used.
Partial pivoting is safe with high probability, particularly if the scaled version is used (experimental result).
In practice, the various decompositions (LU, LDU, LUP, Cholesky, etc.) are of particular importance, as they often allow for elegant solutions of non-trivial problems.
Iterative Methods for Solving Systems of Linear Equations
Iterative methods successively improve an initial guess until it becomes satisfactory. The iterative solution of $Ax = b$ requires the equation to be re-arranged into fixed point form as follows:
\[ x = T(x) := Cx + d. \]
Since subscripts are traditionally used to indicate components of a vector, we will use a superscript on the vector $x$ to denote the iteration:
$x^k$ is the $k$th “guess” or iterate of the solution vector $x$.
Then $x^k_i$ denotes the value of the $i$th component $x_i$ at the $k$th iteration.
A Revision
Convergence of these iterative methods is usually guaranteed only for diagonally dominant matrices, because:
$T$ is a contraction mapping (with respect to some norm) $\iff$ the spectral radius $r(C) < 1$, where $r(C)$ is the largest absolute value of an eigenvalue of $C$.
A sufficient condition for this is that, for some matrix norm $\|\cdot\|$, we have $\|C\| < 1$. This is the case when $A$ is strictly diagonally dominant.
Then Banach's Fixed Point Theorem tells us that the sequence $(x^k)$ defined by $x^{k+1} := T(x^k)$ will converge to a unique limit $x$, the solution of $Ax = b$.
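These conditions can be checked numerically on a small example. The sketch below assumes NumPy and uses the choice $C = -D^{-1}(A - D)$, $d = D^{-1}b$ with $D$ the diagonal of $A$; the test matrix is a hypothetical strictly diagonally dominant example, chosen so that the exact solution is $(1, 1, 1)$.

```python
import numpy as np

# Illustrative strictly diagonally dominant system (an assumption, not from the lecture).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [0.0, 1.0, 3.0]])
b = np.array([6.0, 8.0, 4.0])           # chosen so that x = (1, 1, 1)

D = np.diag(np.diag(A))
C = -np.linalg.solve(D, A - D)          # C = -D^{-1}(A - D)
d = np.linalg.solve(D, b)               # d = D^{-1} b

spectral_radius = max(abs(np.linalg.eigvals(C)))
norm_inf = np.linalg.norm(C, np.inf)    # maximum absolute row sum of C

# Fixed point iteration x^{k+1} := T(x^k) = C x^k + d
x = np.zeros_like(b)
for k in range(200):
    x = C @ x + d

print("spectral radius:", spectral_radius, " infinity norm:", norm_inf)
print("iterate:", x, " direct solve:", np.linalg.solve(A, b))
```

Since the spectral radius never exceeds any induced matrix norm, checking $\|C\|_\infty < 1$ is the cheaper test; the eigenvalue computation is shown only for comparison.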
Order of Convergence
Assume the sequence $(x^k)$ converges to the fixed point $x$ and define $e^k := x - x^k$, the error at the $k$th iteration.
Then we have
\[
\begin{aligned}
e^{k+1} = x - x^{k+1} &= x - (Cx^k + d) \\
&= Cx + d - (Cx^k + d) && \text{since $x$ is a fixed point} \\
&= C(x - x^k) = Ce^k && \text{by linearity of matrix multiplication.}
\end{aligned}
\]
Hence $\|e^{k+1}\| \le \|C\|\,\|e^k\|$, i.e., linear order of convergence.
The smaller $\|C\|$ is, the faster the iterations converge to a solution.
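The bound can be observed directly by tracking the error ratio $\|e^{k+1}\| / \|e^k\|$ over a few iterations. The following is a minimal sketch, assuming NumPy; the matrix `C` and vector `d` are hypothetical values with $\|C\|_\infty < 1$, and the exact fixed point is obtained from $(I - C)x = d$ purely for measuring the error.

```python
import numpy as np

# Illustrative iteration matrix with infinity norm below 1 (an assumption for this sketch).
C = np.array([[0.0, -0.25, -0.25],
              [-0.2, 0.0, -0.4],
              [0.0, -1.0 / 3.0, 0.0]])
d = np.array([1.5, 1.6, 4.0 / 3.0])

x_star = np.linalg.solve(np.eye(3) - C, d)   # exact fixed point of x = Cx + d
norm_C = np.linalg.norm(C, np.inf)

x = np.zeros(3)
errors = [np.linalg.norm(x_star - x, np.inf)]
for k in range(20):
    x = C @ x + d
    errors.append(np.linalg.norm(x_star - x, np.inf))

# Each ratio ||e^{k+1}|| / ||e^k|| should not exceed ||C||.
ratios = [errors[k + 1] / errors[k] for k in range(20)]
print("||C|| =", norm_C, " largest observed ratio =", max(ratios))
```

The observed ratios are typically smaller than $\|C\|$, since the norm is only an upper bound on the contraction achieved at each step.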
Transforming Ax = b to x = Cx + d
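$A$ can be split to rewrite $Ax = b$ in fixed point form $x = Cx + d$ in a number of ways, including Jacobi and Gauss-Seidel.
In both cases, because of the way $C$ is derived from $A$, it turns out that if $A$ is strictly diagonally dominant by rows, i.e., $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for every $i$, then $\|C\|_\infty < 1$, and our sufficient condition for convergence of the sequence $(x^k)$ holds true. For Jacobi, for instance, row $i$ of $C$ has entries $-a_{ij}/a_{ii}$ for $j \ne i$, whose absolute values sum to less than 1.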
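A short check of this claim, for both splittings, might look as follows. This is a sketch under stated assumptions: NumPy is used, the helper names `is_strictly_diagonally_dominant` and `iteration_matrices` are hypothetical, and the test matrix is an illustrative example rather than one from the lecture. The iteration matrices are the standard ones, $-D^{-1}(L + U)$ for Jacobi and $-(D + L)^{-1}U$ for Gauss-Seidel, where $L$ and $U$ are the strictly lower and upper triangular parts of $A$.

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """True if |a_ii| > sum_{j != i} |a_ij| for every row i."""
    diag = np.abs(np.diag(A))
    off_diag = np.sum(np.abs(A), axis=1) - diag
    return bool(np.all(diag > off_diag))

def iteration_matrices(A):
    """Return the Jacobi and Gauss-Seidel iteration matrices C for A."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)   # strictly lower triangular part
    U = np.triu(A, k=1)    # strictly upper triangular part
    C_jacobi = -np.linalg.solve(D, L + U)
    C_gauss_seidel = -np.linalg.solve(D + L, U)
    return C_jacobi, C_gauss_seidel

A = np.array([[10.0, -1.0, 2.0],
              [-1.0, 11.0, -1.0],
              [2.0, -1.0, 10.0]])
C_j, C_gs = iteration_matrices(A)
print("strictly diagonally dominant:", is_strictly_diagonally_dominant(A))
print("||C||_inf  Jacobi:", np.linalg.norm(C_j, np.inf),
      " Gauss-Seidel:", np.linalg.norm(C_gs, np.inf))
```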
Jacobi Method
This splits $A$ as follows:
\[ Ax = (A - D + D)x = b, \]
where $D$ is the diagonal matrix formed from the diagonal elements of $A$. This leads to
\[ C = -D^{-1}(A - D) \quad \text{and} \quad d = D^{-1}b. \]
Each component of the new vector $x^{k+1}$ can be calculated using $A$ and $b$: for $i := 1$ to $n$ do
\[ x^{k+1}_i := \frac{1}{a_{ii}} \Bigl( b_i - \sum_{j=1}^{i-1} a_{ij} x^k_j - \sum_{j=i+1}^{n} a_{ij} x^k_j \Bigr). \]
This iteration formula can be written in correction form as: for $i := 1$ to $n$ do
\[ x^{k+1}_i := x^k_i + \frac{1}{a_{ii}} \Bigl( b_i - \sum_{j=1}^{n} a_{ij} x^k_j \Bigr). \]
Jacobi Method
In terms of code:
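import numpy as np

def Jacobi(A, b, tol=1e-10, limit=100):
    # Work in floating point so that integer inputs are not truncated.
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    for iteration in range(limit):
        x_new = np.zeros_like(x)
        for i in range(A.shape[0]):
            # Both sums use only the previous iterate x.
            s1 = np.dot(A[i, :i], x[:i])
            s2 = np.dot(A[i, i + 1:], x[i + 1:])
            x_new[i] = (b[i] - s1 - s2) / A[i, i]
        # Stop once successive iterates agree to within the absolute tolerance tol.
        converged = np.allclose(x, x_new, atol=tol, rtol=0.0)
        x = x_new
        if converged:
            break
    return x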
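As a usage sketch, the block below restates the Jacobi iteration in vectorised correction form, $x^{k+1} = x^k + D^{-1}(b - Ax^k)$, and applies it to a small test system. The function name `jacobi_vectorised`, the returned iteration count, and the test system (strictly diagonally dominant by rows, with exact solution $(1, 2, -1, 1)$) are assumptions made for illustration.

```python
import numpy as np

def jacobi_vectorised(A, b, tol=1e-10, limit=100):
    """Jacobi iteration in correction form; returns (x, iterations used)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    diag = np.diag(A)                      # the entries a_ii of D
    x = np.zeros_like(b)
    for k in range(1, limit + 1):
        x_new = x + (b - A @ x) / diag     # x + D^{-1}(b - A x)
        if np.linalg.norm(x_new - x, np.inf) <= tol:
            return x_new, k
        x = x_new
    return x, limit

# Hypothetical test system, strictly diagonally dominant by rows.
A = np.array([[10.0, -1.0, 2.0, 0.0],
              [-1.0, 11.0, -1.0, 3.0],
              [2.0, -1.0, 10.0, -1.0],
              [0.0, 3.0, -1.0, 8.0]])
b = np.array([6.0, 25.0, -11.0, 15.0])

x, iterations = jacobi_vectorised(A, b)
print("solution:", x, " iterations:", iterations)
print("residual norm:", np.linalg.norm(A @ x - b, np.inf))
```

Dividing by the diagonal entries avoids forming $D^{-1}$ explicitly, and the whole update is one matrix-vector product per iteration.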
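Gauss-Seidel Method
This splits $A$ as follows:
\[ Ax = (L + D + U)x = b, \]
where $L$, $U$ and $D$ are the matrices formed from the sub-diagonal (strictly lower triangular), super-diagonal (strictly upper triangular), and diagonal elements of $A$, respectively. This leads to
\[ C = -(D + L)^{-1}U \quad \text{and} \quad d = (D + L)^{-1}b. \]
Each component of the new vector $x^{k+1}$ can be calculated using $A$ and $b$: for $i := 1$ to $n$ do
\[ x^{k+1}_i := \frac{1}{a_{ii}} \Bigl( b_i - \sum_{j=1}^{i-1} a_{ij} x^{k+1}_j - \sum_{j=i+1}^{n} a_{ij} x^k_j \Bigr). \]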
Gauss-Seidel Method
In terms of code:
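import numpy as np

def GaussSeidel(A, b, tol=1e-10, limit=100):
    # Work in floating point so that integer inputs are not truncated.
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    for iteration in range(limit):
        x_new = np.zeros_like(x)
        for i in range(A.shape[0]):
            # Components j < i come from the new iterate, j > i from the old one.
            s1 = np.dot(A[i, :i], x_new[:i])
            s2 = np.dot(A[i, i + 1:], x[i + 1:])
            x_new[i] = (b[i] - s1 - s2) / A[i, i]
        # Stop once successive iterates agree to within the absolute tolerance tol.
        converged = np.allclose(x, x_new, atol=tol, rtol=0.0)
        x = x_new
        if converged:
            break
    return x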
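The only difference between the two methods is which iterate supplies the components $j < i$, and this can be isolated in a single flag. The sketch below, assuming NumPy, counts the sweeps each method needs on a hypothetical tridiagonal test system; the function name `sweep_count` and the flag `use_new_values` are illustrative.

```python
import numpy as np

def sweep_count(A, b, use_new_values, tol=1e-10, limit=500):
    """Run Jacobi (use_new_values=False) or Gauss-Seidel (True); return (x, sweeps)."""
    n = len(b)
    x = np.zeros(n)
    for sweep in range(1, limit + 1):
        x_old = x.copy()
        for i in range(n):
            # Gauss-Seidel reads already-updated components j < i; Jacobi does not.
            lower = x if use_new_values else x_old
            s = A[i, :i] @ lower[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) <= tol:
            return x, sweep
    return x, limit

# Hypothetical test system: tridiagonal with 4 on the diagonal and -1 beside it.
A = 4.0 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
b = A @ np.ones(4)                      # exact solution is the vector of ones

x_jacobi, sweeps_jacobi = sweep_count(A, b, use_new_values=False)
x_gs, sweeps_gs = sweep_count(A, b, use_new_values=True)
print("Jacobi sweeps:", sweeps_jacobi, " Gauss-Seidel sweeps:", sweeps_gs)
```

On this kind of matrix Gauss-Seidel should need noticeably fewer sweeps; this is not a general guarantee for arbitrary matrices.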
Contrasting Jacobi and Gauss-Seidel
Gauss-Seidel uses a new component of $x$ as soon as it becomes available, in contrast to the Jacobi method, which waits for all $n$ new components before using any of them.
The correction form of the Gauss-Seidel iteration formula is: for $i := 1$ to $n$ do
\[ x^{k+1}_i := x^k_i + \frac{1}{a_{ii}} \Bigl( b_i - \sum_{j=1}^{i-1} a_{ij} x^{k+1}_j - \sum_{j=i}^{n} a_{ij} x^k_j \Bigr). \]
In vector-matrix form this is
\[ x^{k+1} = x^k + D^{-1}\bigl(b - Lx^{k+1} - (D + U)x^k\bigr) = x^k + D^{-1} r^{k,k+1}, \]
where $r^{k,k+1}$ is the ‘residual’ after the $k$th iteration.
Comparison of Iterative Methods
All have first order convergence, i.e., $\|e^{k+1}\| \le \|C\|\,\|e^k\|$, where $C$ depends on the method used.
The similarities between the methods can be seen most easily if we write them in matrix correction form:
\[ x^{k+1} = x^k + D^{-1}(b - Ax^k) = x^k + D^{-1} r^k \]
(Jacobi: here $r^k$ is the residual after the $k$th iteration);
\[ x^{k+1} = x^k + D^{-1}\bigl(b - Lx^{k+1} - (D + U)x^k\bigr) = x^k + D^{-1} r^{k,k+1} \]
(Gauss-Seidel: here $r^{k,k+1}$ is the ‘residual’ after the $k$th iteration).
Thus Jacobi and Gauss-Seidel use different approximations to the matrix $A^{-1}$: Jacobi uses $D^{-1}$, while solving the Gauss-Seidel formula for $x^{k+1}$ gives $x^{k+1} = x^k + (D + L)^{-1} r^k$, i.e., it uses $(D + L)^{-1}$.
In both cases, the rate of convergence slows down as the condition number increases.
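The equivalence of the fixed point and correction forms is easy to confirm numerically. Solving the Gauss-Seidel formula for $x^{k+1}$ gives $(D + L)x^{k+1} = b - Ux^k$, which is the same as $x^{k+1} = x^k + (D + L)^{-1}(b - Ax^k)$. The sketch below assumes NumPy and uses hypothetical random test data made strictly diagonally dominant; it checks one step of each method from an arbitrary iterate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# Hypothetical test data: a random matrix made strictly diagonally dominant.
A = rng.standard_normal((n, n))
A += np.diag(np.sum(np.abs(A), axis=1) + 1.0)
b = rng.standard_normal(n)
x = rng.standard_normal(n)               # an arbitrary current iterate x^k

D = np.diag(np.diag(A))
L = np.tril(A, k=-1)                     # strictly lower triangular part
U = np.triu(A, k=1)                      # strictly upper triangular part
r = b - A @ x                            # residual r^k

# Jacobi: fixed point form versus correction form with D^{-1}.
jacobi_fixed_point = np.linalg.solve(D, b - (L + U) @ x)
jacobi_correction = x + np.linalg.solve(D, r)

# Gauss-Seidel: fixed point form versus correction form with (D + L)^{-1}.
gs_fixed_point = np.linalg.solve(D + L, b - U @ x)
gs_correction = x + np.linalg.solve(D + L, r)

# Gauss-Seidel in the mixed-residual form, which involves the new iterate itself.
gs_mixed = x + np.linalg.solve(D, b - L @ gs_fixed_point - (D + U) @ x)

print(np.allclose(jacobi_fixed_point, jacobi_correction))
print(np.allclose(gs_fixed_point, gs_correction))
print(np.allclose(gs_fixed_point, gs_mixed))
```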
Iterative Methods in the Real World
There are much more sophisticated iterative methods, including conjugate gradients (CG), generalised minimal residuals (GMRES), and numerous randomised methods.
More importantly, there are sophisticated means of preconditioning, i.e., lowering the condition number.
These fall outside of our scope, but we will provide the briefest of overviews of each.
Iterative Methods in the Real World: Randomisation
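If one draws an i.i.d. random matrix $S \in \mathbb{R}^{m \times q}$ at each iteration, one can apply a step where $x^{k+1}$ is the best approximation of the solution $x^*$ in a random affine space passing through $x^k$:
\[ x^{k+1} = \arg\min_{x \in \mathbb{R}^n} \|x - x^*\|_B^2 \quad \text{subject to } x = x^k + B^{-1}A^T S y, \ y \in \mathbb{R}^q \text{ free}, \tag{6.1} \]
where $B$ is an $n \times n$ positive definite matrix used to define the $B$-inner product and the induced $B$-norm by
\[ \langle x, y \rangle_B := \langle Bx, y \rangle, \qquad \|x\|_B := \sqrt{\langle x, x \rangle_B}, \tag{6.2} \]
where $\langle \cdot, \cdot \rangle$ is the standard Euclidean inner product. As it turns out, one can prove very strong convergence results for such methods, such as linear convergence in expectation.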
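For a concrete instance, take $B = I$ and let $S$ be a randomly chosen unit coordinate vector $e_i$ (so $q = 1$). The constraint in (6.1) then reads $x = x^k + a_i y$, where $a_i$ is the $i$th row of $A$ written as a column, and minimising over $y$ gives $y = (b_i - a_i^T x^k) / \|a_i\|^2$, which does not require knowing $x^*$. This is the randomised Kaczmarz update. The sketch below assumes NumPy, uniform sampling of the rows, and a hypothetical small test system; none of these choices come from the lecture.

```python
import numpy as np

def randomised_kaczmarz(A, b, iterations=2000, seed=0):
    """Step (6.1) with B = I and S = e_i for a uniformly random row index i."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iterations):
        i = rng.integers(m)                   # pick one equation at random
        a_i = A[i]
        # Best approximation of x* on the line x + a_i * y.
        y = (b[i] - a_i @ x) / (a_i @ a_i)
        x = x + a_i * y
    return x

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [0.0, 1.0, 3.0]])
b = np.array([6.0, 8.0, 4.0])                 # exact solution is (1, 1, 1)

x = randomised_kaczmarz(A, b)
print("approximate solution:", x)
print("residual norm:", np.linalg.norm(A @ x - b))
```

Each step touches a single row of $A$, which is what makes methods of this kind attractive for very large systems.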
Iterative Methods in the Real World: Krylov Subspace
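CG and GMRES can be explained as Krylov subspace methods, with iteration
\[ x^{k+1} := \arg\min_{x \in \mathbb{R}^n} \|x - x^*\|_B^2 \quad \text{subject to } x \in x^0 + \mathcal{K}_{k+1}, \tag{6.3} \]
where $\mathcal{K}_{k+1} \subset \mathbb{R}^n$ is a $(k+1)$-dimensional subspace, typically the Krylov subspace spanned by $r^0, Ar^0, \ldots, A^k r^0$ with $r^0 = b - Ax^0$, and the constraint set $x^0 + \mathcal{K}_{k+1}$ is an affine space that contains $x^0$.
GMRES uses $B = A^T A$ in the objective $\|x - x^*\|_B^2$, and CG uses $B = A$ (which requires $A$ to be symmetric positive definite).
Alternatively, one can think in terms of the Cayley-Hamilton theorem: for any invertible $A$ there exists a polynomial $q$ of degree at most $n - 1$ such that $q(A) = A^{-1}$. In each iteration, we increase the allowable degree by 1.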
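To make the Krylov view concrete, here is a minimal sketch of the textbook conjugate gradient recurrence, assuming NumPy and a symmetric positive definite $A$. The function name `conjugate_gradient`, its stopping rule on the residual norm, and the test system are illustrative assumptions; production code would normally call a library routine instead.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Textbook CG for symmetric positive definite A; returns (x, iterations)."""
    n = len(b)
    max_iter = 10 * n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b - A @ x              # residual r^0
    p = r.copy()               # first search direction
    rs_old = r @ r
    for k in range(max_iter):
        if np.sqrt(rs_old) <= tol:
            return x, k
        Ap = A @ p
        alpha = rs_old / (p @ Ap)          # exact line search along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p      # next A-conjugate direction
        rs_old = rs_new
    return x, max_iter

# Hypothetical symmetric positive definite test system.
A = 4.0 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
b = A @ np.ones(4)                         # exact solution is the vector of ones

x, iterations = conjugate_gradient(A, b)
print("solution:", x, " iterations:", iterations)
```

In exact arithmetic CG terminates after at most $n$ steps, in line with the polynomial argument above; in floating point the residual test decides when to stop.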
Iterative Methods in the Real World: Preconditioning
Many people solve $P^{-1}(Ax - b) = 0$ instead of $Ax - b = 0$, in the hope that $P^{-1}A$ has a lower condition number than $A$:
\[ x^{k+1} = x^k - \gamma_k P^{-1}(Ax^k - b), \tag{6.4} \]
where $\gamma_k$ is a step size.
A non-singular preconditioner $P$ is often problem-specific and applied in a matrix-free fashion, i.e., without ever instantiating $P$.
For example, the Jacobi preconditioner uses $P = \operatorname{diag}(A)$.
Many other preconditioners are chosen so that $P^{-1}$ approximates $A^{-1}$.
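Iteration (6.4) with the Jacobi preconditioner and $\gamma_k = 1$ is exactly the Jacobi method in correction form. The sketch below, assuming NumPy, applies $P^{-1}$ as a function rather than as a matrix and compares condition numbers before and after preconditioning; the function name `preconditioned_richardson`, the fixed step size, and the badly scaled test matrix are hypothetical choices for illustration.

```python
import numpy as np

def preconditioned_richardson(A, b, apply_P_inv, gamma=1.0, tol=1e-10, limit=500):
    """Iterate x <- x - gamma * P^{-1}(A x - b); returns (x, iterations)."""
    x = np.zeros_like(b, dtype=float)
    for k in range(1, limit + 1):
        step = gamma * apply_P_inv(A @ x - b)
        x = x - step
        if np.linalg.norm(step, np.inf) <= tol:
            return x, k
    return x, limit

# Hypothetical badly scaled, strictly diagonally dominant test matrix.
A = np.array([[100.0, 1.0, 0.0],
              [1.0, 10.0, 1.0],
              [0.0, 0.1, 1.0]])
b = A @ np.ones(3)                         # exact solution is the vector of ones

diag = np.diag(A).copy()

def jacobi_preconditioner(r):
    """Apply P^{-1} for P = diag(A) without forming P as a matrix."""
    return r / diag

x, iterations = preconditioned_richardson(A, b, jacobi_preconditioner)
cond_A = np.linalg.cond(A)
cond_PA = np.linalg.cond(A / diag[:, None])   # P^{-1}A scales row i by 1 / a_ii
print("solution:", x, " iterations:", iterations)
print("cond(A) =", cond_A, " cond(P^{-1}A) =", cond_PA)
```

Passing the preconditioner as a callable mirrors the matrix-free usage described above: only the action of $P^{-1}$ on a vector is ever needed.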