Iterative Methods for Solving Linear Equations

Biostat/Biomath M257

Author

Dr. Hua Zhou @ UCLA

Published

May 3, 2023

System information (for reproducibility):

versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.5.0)
  CPU: 12 × Apple M2 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_NUM_THREADS = 8
  JULIA_EDITOR = code

Load packages:

using Pkg

Pkg.activate(pwd())
Pkg.instantiate()
Pkg.status()
  Activating project at `~/Documents/github.com/ucla-biostat-257/2023spring/slides/17-iterative`
Status `~/Documents/github.com/ucla-biostat-257/2023spring/slides/17-iterative/Project.toml`
  [6e4b80f9] BenchmarkTools v1.3.2
  [42fd0dbc] IterativeSolvers v0.9.2
  [b51810bb] MatrixDepot v1.0.10
  [b8865327] UnicodePlots v3.5.2
  [37e2e46d] LinearAlgebra
  [9a3f8284] Random

0.1 Introduction

So far we have considered direct methods for solving linear equations.

  • Direct methods (flops fixed a priori) vs iterative methods:
    • Direct method (GE/LU, Cholesky, QR, SVD): good for dense, small to moderate sized \(\mathbf{A}\)
    • Iterative methods (Jacobi, Gauss-Seidal, SOR, conjugate-gradient, GMRES): good for large, sparse, structured linear system, parallel computing, warm start

0.2 PageRank problem

  • \(\mathbf{A} \in \{0,1\}^{n \times n}\) the connectivity matrix of webpages with entries \[ a_{ij} = \begin{cases} 1 & \text{if page $i$ links to page $j$} \\ 0 & \text{otherwise} \end{cases}. \] \(n \approx 10^9\) in May 2017.

  • \(r_i = \sum_j a_{ij}\) is the out-degree of page \(i\).

  • Larry Page imagines a random surfer wandering on internet according to following rules:

    • From a page \(i\) with \(r_i>0\)
      • with probability \(p\), (s)he randomly chooses a link on page \(i\) (uniformly) and follows that link to the next page
      • with probability \(1-p\), (s)he randomly chooses one page from the set of all \(n\) pages (uniformly) and proceeds to that page
    • From a page \(i\) with \(r_i=0\) (a dangling page), (s)he randomly chooses one page from the set of all \(n\) pages (uniformly) and proceeds to that page

The process defines a Markov chain on the space of \(n\) pages. Stationary distribution of this Markov chain gives the ranks (probabilities) of each page.

  • Stationary distribution is the top left eigenvector of the transition matrix \(\mathbf{P}\) corresponding to eigenvalue 1. Equivalently it can be cast as a linear equation \[ (\mathbf{I} - \mathbf{P}^T) \mathbf{x} = \mathbf{0}. \]

  • Gene Golub: Largest matrix computation in world.

  • GE/LU will take \(2 \times (10^9)^3/3/10^{12} \approx 6.66 \times 10^{14}\) seconds \(\approx 20\) million years on a tera-flop supercomputer!

  • Iterative methods come to the rescue.

0.3 Jacobi method

Solve \(\mathbf{A} \mathbf{x} = \mathbf{b}\).

  • Jacobi iteration: \[ x_i^{(t+1)} = \frac{b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}. \]

  • With splitting: \(\mathbf{A} = \mathbf{L} + \mathbf{D} + \mathbf{U}\), Jacobi iteration can be written as \[ \mathbf{D} \mathbf{x}^{(t+1)} = - (\mathbf{L} + \mathbf{U}) \mathbf{x}^{(t)} + \mathbf{b}, \] i.e., \[ \mathbf{x}^{(t+1)} = - \mathbf{D}^{-1} (\mathbf{L} + \mathbf{U}) \mathbf{x}^{(t)} + \mathbf{D}^{-1} \mathbf{b} = - \mathbf{D}^{-1} \mathbf{A} \mathbf{x}^{(t)} + \mathbf{x}^{(t)} + \mathbf{D}^{-1} \mathbf{b}. \]

  • One round costs \(2n^2\) flops with an unstructured \(\mathbf{A}\). Gain over GE/LU if converges in \(o(n)\) iterations. Saving is huge for sparse or structured \(\mathbf{A}\). By structured, we mean the matrix-vector multiplication \(\mathbf{A} \mathbf{v}\) is fast (\(O(n)\) or \(O(n \log n)\)).

0.4 Gauss-Seidel method

  • Gauss-Seidel (GS) iteration: \[ x_i^{(t+1)} = \frac{b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}. \]

  • With splitting, \((\mathbf{D} + \mathbf{L}) \mathbf{x}^{(t+1)} = - \mathbf{U} \mathbf{x}^{(t)} + \mathbf{b}\), i.e., \[ \mathbf{x}^{(t+1)} = - (\mathbf{D} + \mathbf{L})^{-1} \mathbf{U} \mathbf{x}^{(t)} + (\mathbf{D} + \mathbf{L})^{-1} \mathbf{b}. \]

  • GS converges for any \(\mathbf{x}^{(0)}\) for symmetric and pd \(\mathbf{A}\).

  • Convergence rate of Gauss-Seidel is the spectral radius of the \((\mathbf{D} + \mathbf{L})^{-1} \mathbf{U}\).

0.5 Successive over-relaxation (SOR)

  • SOR: \[ x_i^{(t+1)} = \omega \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \sum_{j=i+1}^n a_{ij} x_j^{(t)} \right) / a_{ii} + (1-\omega) x_i^{(t)}, \] i.e., \[ \mathbf{x}^{(t+1)} = (\mathbf{D} + \omega \mathbf{L})^{-1} [(1-\omega) \mathbf{D} - \omega \mathbf{U}] \mathbf{x}^{(t)} + (\mathbf{D} + \omega \mathbf{L})^{-1} (\mathbf{D} + \mathbf{L})^{-1} \omega \mathbf{b}. \]

  • Need to pick \(\omega \in [0,1]\) beforehand, with the goal of improving convergence rate.

0.6 Conjugate gradient method

  • Conjugate gradient and its variants are the top-notch iterative methods for solving huge, structured linear systems.

  • A UCLA invention! Hestenes and Stiefel in 60s.

  • Solving \(\mathbf{A} \mathbf{x} = \mathbf{b}\) is equivalent to minimizing the quadratic function \(\frac{1}{2} \mathbf{x}^T \mathbf{A} \mathbf{x} - \mathbf{b}^T \mathbf{x}\).

Kershaw’s results for a fusion problem.

Method Number of Iterations
Gauss Seidel 208,000
Block SOR methods 765
Incomplete Cholesky conjugate gradient 25

0.7 MatrixDepot.jl

MatrixDepot.jl is an extensive collection of test matrices in Julia. After installation, we can check available test matrices by

using MatrixDepot

mdinfo()
[ Info: verify download of index files...
[ Info: reading database
[ Info: adding metadata...
[ Info: adding svd data...
[ Info: writing database
[ Info: used remote sites are sparse.tamu.edu with MAT index and math.nist.gov with HTML index

0.7.1 Currently loaded Matrices

builtin(#)
1 baart 13 fiedler 25 invhilb 37 parter 49 shaw
2 binomial 14 forsythe 26 invol 38 pascal 50 smallworld
3 blur 15 foxgood 27 kahan 39 pei 51 spikes
4 cauchy 16 frank 28 kms 40 phillips 52 toeplitz
5 chebspec 17 gilbert 29 lehmer 41 poisson 53 tridiag
6 chow 18 golub 30 lotkin 42 prolate 54 triw
7 circul 19 gravity 31 magic 43 randcorr 55 ursell
8 clement 20 grcar 32 minij 44 rando 56 vand
9 companion 21 hadamard 33 moler 45 randsvd 57 wathen
10 deriv2 22 hankel 34 neumann 46 rohess 58 wilkinson
11 dingdong 23 heat 35 oscillate 47 rosser 59 wing
12 erdrey 24 hilb 36 parallax 48 sampling
user(#)
Groups
all local eigen illcond posdef regprob symmetric
builtin user graph inverse random sparse
Suite Sparse of
2 2893
MatrixMarket of
0 498
# List matrices that are positive definite and sparse:
mdlist(:symmetric & :posdef & :sparse)
2-element Vector{String}:
 "poisson"
 "wathen"
# Get help on Poisson matrix
mdinfo("poisson")

1 Poisson Matrix (poisson)

A block tridiagonal matrix from Poisson’s equation. This matrix is sparse, symmetric positive definite and has known eigenvalues. The result is of type SparseMatrixCSC.

Input options:

  • [type,] dim: the dimension of the matrix is dim^2.

Groups: [“inverse”, “symmetric”, “posdef”, “eigen”, “sparse”]

References:

G. H. Golub and C. F. Van Loan, Matrix Computations, second edition, Johns Hopkins University Press, Baltimore, Maryland, 1989 (Section 4.5.4).

# Generate a Poisson matrix of dimension n=10
A = matrixdepot("poisson", 10)
100×100 SparseArrays.SparseMatrixCSC{Float64, Int64} with 460 stored entries:
⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠑⢄⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀⠱⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠱⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢆⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢆⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠑⢄⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀⠑⢄⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦⠀⠀⠑⢄
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠻⣦⡀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠈⠻⣦
using UnicodePlots
spy(A)
       ┌──────────────────────────────────────────┐    
     1 ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ > 0
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ < 0
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
   100 ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
       └──────────────────────────────────────────┘1⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀100⠀    
       ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀460 ≠ 0⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
# Get help on Wathen matrix
mdinfo("wathen")

2 Wathen Matrix (wathen)

Wathen Matrix is a sparse, symmetric positive, random matrix arose from the finite element method. The generated matrix is the consistent mass matrix for a regular nx-by-ny grid of 8-nodes.

Input options:

  • [type,] nx, ny: the dimension of the matrix is equal to 3 * nx * ny + 2 * nx * ny + 1.
  • [type,] n: nx = ny = n.

Groups: [“symmetric”, “posdef”, “eigen”, “random”, “sparse”]

References:

A. J. Wathen, Realistic eigenvalue bounds for the Galerkin mass matrix, IMA J. Numer. Anal., 7 (1987), pp. 449-457.

# Generate a Wathen matrix of dimension n=5
A = matrixdepot("wathen", 5)
96×96 SparseArrays.SparseMatrixCSC{Float64, Int64} with 1256 stored entries:
⠿⣧⣀⠀⠸⣧⡀⠿⣧⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠘⢻⣶⡀⠘⣇⠀⠘⢻⣶⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠶⣦⣀⠈⠱⣦⡉⠶⣦⣀⠈⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⣤⡌⠉⠙⢣⡌⠛⣤⡌⠉⠙⢣⡄⠀⣤⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠉⢻⣶⣀⠈⢻⡆⠉⢻⣶⣀⠈⢻⣆⠉⠛⣷⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠘⠻⠆⠀⠷⣀⡀⠘⠻⢆⡀⠻⣀⡀⠘⠻⠶⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠉⠻⢶⣤⡈⠻⣦⠉⠛⢷⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠿⣧⠀⠀⠸⣧⠀⠿⣧⡄⠀⠸⣧⠀⠿⣧⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠙⢻⣶⡀⠙⣷⠀⠉⢻⣶⣀⠙⣷⡀⠉⢻⣶⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⠃⠀⠘⠶⣦⣄⠘⠻⣦⡘⠷⣦⣄⠘⠛⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣤⡄⠙⠻⢶⡌⠻⣦⡄⠙⠻⠶⡄⠀⢠⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠿⣧⣀⠈⢿⣄⠉⠿⣧⣀⠀⢿⣄⠈⠿⣧⣄⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⢻⣶⠀⢻⡆⠀⠘⢻⣶⠀⢻⡆⠀⠀⢻⣶⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠛⢷⣤⣀⠻⣦⡈⠛⠷⣦⣀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠶⣦⡄⠈⠉⣦⠈⠱⣦⡄⠈⠉⢶⠀⠰⣦⡄⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⢿⣤⣀⠹⣧⡀⠉⠿⣧⣀⠸⣧⡀⠉⠿⣧⣀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⠛⠀⠘⢣⣄⣀⡘⠛⣤⡘⢣⣄⣀⡘⠛
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡀⠉⠻⠶⣈⠻⢆⡀⠉⠻⠶
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠿⣧⡄⠀⢹⡄⠈⠿⣧⡄⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⢻⣶⠈⢻⡆⠀⠉⢻⣶
spy(A)
      ┌──────────────────────────────────────────┐    
    1 ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ > 0
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ < 0
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
   96 ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
      └──────────────────────────────────────────┘1⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀96⠀    
      ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀1 256 ≠ 0⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    

2.1 Numerical examples

The IterativeSolvers.jl package implements most commonly used iterative solvers.

2.1.1 Generate test matrix

using BenchmarkTools, IterativeSolvers, LinearAlgebra, MatrixDepot, Random

Random.seed!(280)

n = 100
# Poisson matrix of dimension n^2=10000, pd and sparse
A = matrixdepot("poisson", n)
@show typeof(A)
# dense matrix representation of A
Afull = convert(Matrix, A)
@show typeof(Afull)
# sparsity level
count(!iszero, A) / length(A)
typeof(A) = SparseArrays.SparseMatrixCSC{Float64, Int64}
typeof(Afull) = Matrix{Float64}
0.000496
spy(A)
          ┌──────────────────────────────────────────┐    
        1 ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ > 0
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ < 0
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
   10 000 ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
          └──────────────────────────────────────────┘1⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀10 000⠀    
          ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀49 600 ≠ 0⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀    
# storage difference
Base.summarysize(A), Base.summarysize(Afull)
(873768, 800000040)

2.1.2 Matrix-vector muliplication

# randomly generated vector of length n^2
b = randn(n^2)
# dense matrix-vector multiplication
@benchmark $Afull * $b
BenchmarkTools.Trial: 401 samples with 1 evaluation.
 Range (minmax):  12.378 ms12.714 ms   GC (min … max): 0.00% … 0.00%
 Time  (median):     12.431 ms               GC (median):    0.00%
 Time  (mean ± σ):   12.480 ms ± 92.225 μs   GC (mean ± σ):  0.00% ± 0.00%
     ▂▄▅█▅▅▇                                                  
  ▂▄▆███████▆▇▅▅▄▅▄▃▃▃▃▃▂▂▃▃▄▁▁▁▁▂▁▁▂▂▂▁▅▄▅▃▄▆▆▄▅▅▆▄▆▅▁▃▁▂▃ ▄
  12.4 ms         Histogram: frequency by time        12.7 ms <
 Memory estimate: 78.17 KiB, allocs estimate: 2.
# sparse matrix-vector multiplication
@benchmark $A * $b
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (minmax):  35.250 μs 11.939 ms   GC (min … max): 0.00% … 99.46%
 Time  (median):     40.250 μs                GC (median):    0.00%
 Time  (mean ± σ):   42.514 μs ± 119.086 μs   GC (mean ± σ):  2.79% ±  0.99%
         ▇█▄▃                                              
  ▁▁▁▁▁▁▆█████▇▆▅▃▃▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  35.2 μs         Histogram: frequency by time         62.9 μs <
 Memory estimate: 78.17 KiB, allocs estimate: 2.

2.1.3 Dense solve via Cholesky

# record the Cholesky solution
xchol = cholesky(Afull) \ b;
# dense solve via Cholesky
@benchmark cholesky($Afull) \ $b
BenchmarkTools.Trial: 5 samples with 1 evaluation.
 Range (minmax):  1.139 s 1.150 s   GC (min … max): 0.00% … 0.02%
 Time  (median):     1.144 s              GC (median):    0.02%
 Time  (mean ± σ):   1.145 s ± 4.085 ms   GC (mean ± σ):  0.23% ± 0.34%
  █                                        █      █  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁█ ▁
  1.14 s        Histogram: frequency by time        1.15 s <
 Memory estimate: 763.02 MiB, allocs estimate: 4.

2.1.4 Jacobi solver

It seems that Jacobi solver doesn’t give the correct answer.

xjacobi = jacobi(A, b)
@show norm(xjacobi - xchol)
norm(xjacobi - xchol) = 498.7144076230415
498.7144076230415

Reading documentation we found that the default value of maxiter is 10. A couple trial runs shows that 30000 Jacobi iterations are enough to get an accurate solution.

xjacobi = jacobi(A, b, maxiter = 30000)
@show norm(xjacobi - xchol)
norm(xjacobi - xchol) = 1.6268361036148366e-5
1.6268361036148366e-5
@benchmark jacobi($A, $b, maxiter = 30000)
BenchmarkTools.Trial: 5 samples with 1 evaluation.
 Range (minmax):  1.205 s 1.223 s   GC (min … max): 0.00% … 0.00%
 Time  (median):     1.213 s              GC (median):    0.00%
 Time  (mean ± σ):   1.214 s ± 7.310 ms   GC (mean ± σ):  0.00% ± 0.00%
  █           █                     █      █  
  █▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁█ ▁
  1.21 s        Histogram: frequency by time        1.22 s <
 Memory estimate: 2.06 MiB, allocs estimate: 30008.

2.1.5 Gauss-Seidal

# Gauss-Seidel solution is fairly close to Cholesky solution after 15000 iters
xgs = gauss_seidel(A, b, maxiter = 15000)
@show norm(xgs - xchol)
norm(xgs - xchol) = 1.4292903814087475e-5
1.4292903814087475e-5
@benchmark gauss_seidel($A, $b, maxiter = 15000)
BenchmarkTools.Trial: 4 samples with 1 evaluation.
 Range (minmax):  1.385 s 1.388 s   GC (min … max): 0.00% … 0.00%
 Time  (median):     1.386 s              GC (median):    0.00%
 Time  (mean ± σ):   1.386 s ± 1.419 ms   GC (mean ± σ):  0.00% ± 0.00%
  █     █                                      █  
  █▁▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.39 s        Histogram: frequency by time        1.39 s <
 Memory estimate: 156.58 KiB, allocs estimate: 7.

2.1.6 SOR

# symmetric SOR with ω=0.75
xsor = ssor(A, b, 0.75, maxiter = 10000)
@show norm(xsor - xchol)
norm(xsor - xchol) = 0.00029800912849438084
0.00029800912849438084
@benchmark sor($A, $b, 0.75, maxiter = 10000)
BenchmarkTools.Trial: 5 samples with 1 evaluation.
 Range (minmax):  1.131 s 1.135 s   GC (min … max): 0.00% … 0.00%
 Time  (median):     1.132 s              GC (median):    0.00%
 Time  (mean ± σ):   1.133 s ± 1.399 ms   GC (mean ± σ):  0.00% ± 0.00%
  █      █       █                           █  
  █▁▁▁▁▁▁█▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.13 s        Histogram: frequency by time        1.14 s <
 Memory estimate: 547.27 KiB, allocs estimate: 20009.

2.1.7 Conjugate Gradient (preview of next lecture)

# conjugate gradient
xcg = cg(A, b)
@show norm(xcg - xchol)
norm(xcg - xchol) = 1.223256740471929e-5
1.223256740471929e-5
@benchmark cg($A, $b)
BenchmarkTools.Trial: 282 samples with 1 evaluation.
 Range (minmax):  17.306 ms36.838 ms   GC (min … max): 0.00% … 0.00%
 Time  (median):     17.629 ms               GC (median):    0.00%
 Time  (mean ± σ):   17.761 ms ±  1.182 ms   GC (mean ± σ):  0.00% ± 0.00%
   ▂▇▄▁█▁▁ ▁▄▃  ▃▁ ▃▁▃      ▁                                  
  ████████▆████▇█████▅▆██▆█▆█▄▇▇▅▇▃▆▇▅▇▁▅▄▄▆▃▅▄▁▁▅▁▄▁▃▃▁▁▃▃ ▄
  17.3 ms         Histogram: frequency by time        18.5 ms <
 Memory estimate: 313.39 KiB, allocs estimate: 16.