Sweep operator

Biostat/Biomath M257

Author

Dr. Hua Zhou @ UCLA

Published

April 23, 2023

System information (for reproducibility):

versioninfo()

Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.5.0)
  CPU: 12 × Apple M2 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_NUM_THREADS = 8
  JULIA_EDITOR = code

Load packages:

using Pkg

Pkg.activate(pwd())
Pkg.instantiate()
Pkg.status()

Status `~/Documents/github.com/ucla-biostat-257/2023spring/slides/14-sweep/Project.toml`
  [7522ee7d] SweepOperator v0.3.3
  [37e2e46d] LinearAlgebra
  [9a3f8284] Random

  Activating project at `~/Documents/github.com/ucla-biostat-257/2023spring/slides/14-sweep`

1 Definition

We learnt Cholesky decomposition and QR decomposition approaches for solving linear regression.
The popular statistical software SAS uses sweep operator for linear regression and matrix inversion.
Assume \(\mathbf{A}\) is symmetric and positive semidefinite.
Sweep on the \(k\)-th diagonal entry \(a_{kk} \ne 0\) yields \(\widehat A\) with entries \[ \begin{eqnarray*} \widehat a_{kk} &=& - \frac{1}{a_{kk}} \\ \widehat a_{ik} &=& \frac{a_{ik}}{a_{kk}} \\ \widehat a_{kj} &=& \frac{a_{kj}}{a_{kk}} \\ \widehat a_{ij} &=& a_{ij} - \frac{a_{ik} a_{kj}}{a_{kk}}, \quad i \ne k, j \ne k. \end{eqnarray*} \] \(n^2\) flops (taking into account of symmetry).
Inverse sweep sends \(\mathbf{A}\) to \(\check A\) with entries \[ \begin{eqnarray*} \check a_{kk} &=& - \frac{1}{a_{kk}} \\ \check a_{ik} &=& - \frac{a_{ik}}{a_{kk}} \\ \check a_{kj} &=& - \frac{a_{kj}}{a_{kk}} \\ \check a_{ij} &=& a_{ij} - \frac{a_{ik}a_{kj}}{a_{kk}}, \quad i \ne k, j \ne k. \end{eqnarray*} \] \(n^2\) flops (taking into account of symmetry).
\(\check{\hat{\mathbf{A}}} = \mathbf{A}\).
Successively sweeping all diagonal entries of \(\mathbf{A}\) yields \(- \mathbf{A}^{-1}\).
Exercise: invert a \(2 \times 2\) matrix, say \[ \mathbf{A} = \begin{pmatrix} 4 & 3 \\ 3 & 2 \end{pmatrix}, \] on paper using sweep operator.
Block form of sweep: Let the symmetric matrix \(\mathbf{A}\) be partitioned as \[ \mathbf{A} = \begin{pmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{pmatrix}. \] If possible, sweep on the diagonal entries of \(\mathbf{A}_{11}\) yields
\[ \begin{pmatrix} \, - \mathbf{A}_{11}^{-1} & \mathbf{A}_{11}^{-1} \mathbf{A}_{12} \\ \mathbf{A}_{21} \mathbf{A}_{11}^{-1} & \mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12} \end{pmatrix}. \]
Order dose not matter. The block \(\mathbf{A}_{22} - \mathbf{A}_{21} \mathbf{A}_{11}^{-1} \mathbf{A}_{12}\) is recognized as the Schur complement of \(\mathbf{A}_{11}\).
Pd and determinant: \(\mathbf{A}\) is pd if and only if each diagonal entry can be swept in succession and is positive until it is swept. When a diagonal entry of a pd matrix \(\mathbf{A}\) is swept, it becomes negative and remains negative thereafter. Taking the product of diagonal entries just before each is swept yields the determinant of \(\mathbf{A}\).

2 Applications

2.1 Linear regression (as done in SAS)

Sweep on the diagonal entries 1 to \(p\) of the (augmented) Gram matrix \[ \begin{pmatrix} \mathbf{X}, \mathbf{y} \end{pmatrix}^T \begin{pmatrix} \mathbf{X}, \mathbf{y} \end{pmatrix} = \begin{pmatrix} \mathbf{X}^T \mathbf{X} & \mathbf{X}^T \mathbf{y} \\ \mathbf{y}^T \mathbf{X} & \mathbf{y}^T \mathbf{y} \end{pmatrix} \]
yields
\[ \begin{eqnarray*} \begin{pmatrix} \, - (\mathbf{X}^T \mathbf{X})^{-1} & (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \\ \mathbf{y}^T \mathbf{X} (\mathbf{X}^T \mathbf{X})^{-1} & \mathbf{y}^T \mathbf{y} - \mathbf{y}^T \mathbf{X} (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \end{pmatrix} = \begin{pmatrix} \, - \sigma^{-2} \text{Var}(\widehat{\beta}) & \widehat{\beta} \\ \widehat{\beta}^T & \|\mathbf{y} - \widehat{\mathbf{y}}\|_2^2 \end{pmatrix}. \end{eqnarray*} \]
In total \(np^2 + p^3\) flops.

2.2 Invert a matrix in place

\(n^3\) flops. Recall that inversion by Cholesky costs \((1/3)n^3 + (4/3) n^3 = (5/3) n^3\) flops: potrf and potri.

2.3 Conditional multivariate normal density calculation

2.4 Stepwise regression

2.5 MANOVA

3 Implementation

SweepOperator.jl package (by Josh Day) implements the sweep operator.

using SweepOperator

A = [9. 2 -2; 2 1 0; -2 0 4]

3×3 Matrix{Float64}:
  9.0  2.0  -2.0
  2.0  1.0   0.0
 -2.0  0.0   4.0

B = copy(A)
sweep!(B, 1) # sweep (1, 1) entry

3×3 Matrix{Float64}:
 -0.111111  0.222222  -0.222222
  2.0       0.555556   0.444444
 -2.0       0.0        3.55556

sweep!(B, 2) # sweep (2, 2) entry

3×3 Matrix{Float64}:
 -0.2   0.4  -0.4
  2.0  -1.8   0.8
 -2.0   0.0   3.2

sweep!(B, 3) # sweep (3, 3) entry, we are left with -inv(B)

3×3 Matrix{Float64}:
 -0.25   0.5  -0.125
  2.0   -2.0   0.25
 -2.0    0.0  -0.3125

# check correctness
inv(A)

3×3 Matrix{Float64}:
  0.25   -0.5    0.125
 -0.5     2.0   -0.25
  0.125  -0.25   0.3125

using LinearAlgebra

# sweep! function only changes the upper triangular part
UpperTriangular(inv(A)) ≈ UpperTriangular(- B)

true

# inverse sweep to bring negative inverse back to original matrix
sweep!(B, 1:3, true)

3×3 Matrix{Float64}:
  9.0  2.0  -2.0
  2.0  1.0  -1.11022e-16
 -2.0  0.0   4.0

4 Further reading

Section 7.4-7.6 of Numerical Analysis for Statisticians by Kenneth Lange (2010). Probably the best place to read about sweep operator.
The paper A tutorial on the SWEEP operator by James H. Goodnight (1979). Note this version (anti-symmetry matrix) is slightly different from ours.