This post talks about how to exploit special structure in matrices to perform fast linear system solutions involving them. Mike Croucher in a post over at the NAG blog showed an example of Toeplitz systems, achieving a speedup of 100x over a more generic solver. I show here how to still use that Toeplitz solver even if your matrix is lightly modified, and so no longer Toeplitz.

We live in a world based on linear algebra. Whether you are doing statistics, machine learning, or simulating physics. The core computational burden of these tasks often rests within computational linear algebra. You don’t need to take my word for it, just look at Jack Dongarra’s huge (and growing) list of computational linear algebra software, and my employer’s main software product, the NAG library The *fastest* routines usually exploit some kind of structure in a matrix - see for example the NAG routine real_toeplitz_solve which requires a Toeplitz,symmetric, *and* positive definite input! What do we do when our inputs don’t satisfy these strict requirements? I show here there is sometimes a way around this: low-rank updates. I start with the math and then give an example in Python.

The idea of a low-rank update in linear algebra is that we have some matrix \( A \) which has desirable structure, but we actually want to do computations on a *related* matrix \( B \) such that

\[ B = A + uv^T \]

Here \(uv^T\) is the outer product of vectors \(u\) and \(v\). This is often called a *low rank update* because \(uv^T\) is a low rank matrix. Indeed it is simply rank \( 1 \). It is possible to generalize to rank \(k>1\) as well (see concluding remarks below), but I focus on \(k=1\) for simplicity of presentation here.

For concreteness let’s assume here that we want to solve a linear system \(Bx=b\) for \(x\). In the NAG library we might simply form \(B\) explicitly and pass it to the routine nag_real_gen_lin_solve along with the right-hand-side vector \(b\). But remember, we want *fast*! What if for example \(A\) has this magical Toeplitz property such that a fast Toeplitz solver may be used on it? There is pretty much no chance that \(B\) inherits this property, but we can still exploit it thanks to the Sherman-Morrison formula.

The Sherman-Morrison formula tells us that

\[ B^{-1}b = (A+uv^T)^{-1}b = A^{-1}b - \frac{v^T A^{-1} b}{1+v^T A^{-1}u} A^{-1}u \]

Thus we can complete a solve with \(B\) so long as we know how to complete a few solves with \(A\)! This formula is simple to implement in Python using NumPy matrices. I now give such an example.

The following script is commented and is self-contained. Feel free to play around with it.

In the below example \( A \) is circulant, which permits ultrafast solves by way of FFT. I then ruin this circulant property by adding in a low-rank update to \( A \) to form \( B \), but I show how to still achieve fast solves with \( B \) by exploiting both the circulant property and the low rank update property.

```
import numpy as np
import scipy.fftpack
import scipy.linalg
#Make A a circulant matrix. Linear equations
#with circulant matrices can be solved in O(n*log(n))
#because they are equivalent to a few FFTs.
#Compare this to the more typical O(n^3)
#if we used a generic linear equation solver that doesn't
#exploit the circulant property
#Make n small so we can print outputs easier
n=5
c=np.random.rand(n)
#Our circulant matrix
A=scipy.linalg.circulant(c)
#Quick-and-dirty circulant matrix equation solver for Ax=b
def solveA(b):
return np.real(scipy.fftpack.ifft(scipy.fftpack.fft(b)/scipy.fftpack.fft(c)))
#Now form B as a low rank update of A: B=A+u*v^t
u=np.random.rand(n)
v=np.random.rand(n)
B=A+np.outer(u,v)
#Implement Sherman-morrison update solver:
def sherman_morrison(solve,u,v,b):
sb=solve(b)
su=solve(u)
alpha=1.0/(1.0+np.dot(v,su))
beta=np.dot(v,sb)
return sb - alpha*beta*su
#make a random right-hand-side b
b=np.random.rand(n)
#First solve with a "generic" solver for comparison purposes
print(np.linalg.solve(B,b))
#Now solve with our fast circulant+lowrank trick
print(sherman_morrison(solveA,u,v,b))
```

I often think of low-rank updates as small perturbations of a matrix, thus if we can closely approximate (but not exactly match) a given matrix \(M\) with a well-structured matrix \(A\) then we can often complete the approximation by adding in a simple low-rank update - thus we can still exploit the structure of \(A\) but we more closely match \(M\) than if we simply used \(A\) alone. The trick of course is figuring out what structure we want \(A\) to have, and having established that structure how do we compute \(u,v\). This is often where domain knowledge enters from outside linear algebra. I only gave the example here of rank \(1\) updates, but actually rank \(k>1\) are also possible through the generalized woodbury matrix identity.