Characterizing GMRES Convergence Part 1.4: Preconditioning Example

October 23, 2020

Here I discuss preconditioning strategies for GMRES and how fixating too much on condition numbers, as shown in my last post, can lead one to falsely discard a useful preconditioner. Preconditioning strategies seek to find a matrix \( M \) such that the preconditioned linear system \( M^{-1} A x = M^{-1} b \) takes fewer GMRES iterations than the unpreconditioned system. I have already shown via counter-example that conditioning can have very weak relevance to GMRES performance, but here I give a concrete system \( Ax=b \) and a preconditioning matrix \( M \) such that the preconditioned system actually has a greater condition number than the unpreconditioned system and yet takes fewer GMRES iterations to solve. This does not add any extra weight to my previous post, but it does put in more practical terms the importance of looking beyond conditioning when evaluating techniques for GMRES.

The matrix family \( A(\omega) \)

The family of matrices is generated by three parameters \( n _ x, n _ y, \omega \), but for simplicity I fix \( n _ x = n _ y = 32 \) since these do not materially impact the results. The matrices are generated by a difference formula as follows

$$ \begin{aligned} (Ax) _ {i,j} &= \omega x _ {i,j} + x _ {i+1,j} + x _ {i-1,j} + x _ {i,j-1} + x _ {i,j+1} \end{aligned} $$

with all indexing outside of the boundaries taken to evaluate to \( 0 \). I give an example of the matrix for \( n _ x = n _ y = 4 \) below:

$$ A(\omega) = \begin{bmatrix} \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} \end{bmatrix} $$
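To make the difference formula concrete, here is a small sketch (my own illustration, not from the original post) that applies the stencil directly to a grid stored as an \( n _ x \times n _ y \) NumPy array; the helper name apply_stencil is hypothetical, and the slicing simply omits out-of-range neighbors, which matches treating them as zero.

import numpy as np

def apply_stencil(x, omega):
    # x is an (nx, ny) grid; neighbors outside the boundary contribute zero.
    y = omega * x
    y[1:, :]  += x[:-1, :]   # x_{i-1,j}
    y[:-1, :] += x[1:, :]    # x_{i+1,j}
    y[:, 1:]  += x[:, :-1]   # x_{i,j-1}
    y[:, :-1] += x[:, 1:]    # x_{i,j+1}
    return y

With the grid flattened in row-major order, this should agree with multiplying by the matrix constructed by symm2d below.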

Python code for generating this matrix, symm2d, is given below:

import scipy.sparse as sp

def symm1d(n):
    # One-dimensional stencil: ones on the first sub- and super-diagonals.
    A=sp.lil_matrix((n,n))
    A[0,1]=1.0
    A[n-1,n-2]=1.0
    for i in range(1,n-1):
        A[i,i-1]=1.0
        A[i,i+1]=1.0
    return A

def symm2d(nx,ny,omega):
    # Two-dimensional operator: Kronecker sum of the 1D stencils plus omega on the diagonal.
    Ax=symm1d(nx)
    Ay=symm1d(ny)
    Ix=sp.eye(nx,nx)
    Iy=sp.eye(ny,ny)
    I=sp.eye(nx*ny,nx*ny)
    return sp.kron(Ax,Iy) + sp.kron(Ix,Ay) + omega*I
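As a quick sanity check (a hypothetical usage; \( \omega = 2 \) is an arbitrary choice), the \( 16 \times 16 \) example matrix shown above can be reproduced with:

A_example = symm2d(4, 4, 2.0)   # nx = ny = 4 as in the example above
print(A_example.toarray())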

The preconditioner matrix \( M(\omega) \)

For the preconditioner I use a simple strategy called domain decomposition: in addition to the size parameters \(n _ x, n _ y\) I choose blocking parameters \(b _ x, b _ y \) and iterate over blocks of size \( b _ x \times b _ y \), applying the boundary condition to any index that falls outside the current block (normally the boundary conditions are applied only on the whole domain rather than on subdomains). This drops the nonzeros that couple the blocks and has multiple computational benefits. If one simply uses a sparse LU factorization, the domain-decomposed matrix \( M \) typically has significantly less fill-in than a sparse LU factorization of the whole matrix \( A \), leading to a cheaper and more memory-efficient factorization (I give a small fill-in comparison after the code below). Alternatively, one can solve the subdomains in parallel, choosing \( b _ x, b _ y \) to match the parallel capability of the computer the solver will run on. I give an example of the matrix \( M(\omega) \) below using the parameters \( n _ x = n _ y = 4 \) and \( b _ x = b _ y = 2 \)

$$ M(\omega) = \begin{bmatrix} \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} \end{bmatrix} $$

and the code for generating it mirrors the code for \( A(\omega) \):

def symm1d_decomp(n,b):
    # One-dimensional stencil with couplings dropped across block boundaries of size b.
    A=sp.lil_matrix((n,n))
    for i in range(0,n,b):
        for j in range(i,min(i+b,n)):
            if j>i:
                A[j,j-1]=1.0
            if j<min(i+b,n)-1:
                A[j,j+1]=1.0
    return A

def symm2d_decomp(nx,ny,bx,by,omega):
    # Same Kronecker-sum construction as symm2d, but the dropped couplings
    # decouple the grid into independent bx x by subdomains.
    Ax=symm1d_decomp(nx,bx)
    Ay=symm1d_decomp(ny,by)
    Ix=sp.eye(nx,nx)
    Iy=sp.eye(ny,ny)
    I=sp.eye(nx*ny,nx*ny)
    return sp.kron(Ax,Iy) + sp.kron(Ix,Ay) + omega*I
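To make the fill-in claim from the domain-decomposition discussion concrete, here is a rough sketch (my own check, not from the original post) comparing the nonzeros in the sparse LU factors of \( A \) and \( M \) via scipy.sparse.linalg.splu; the exact counts depend on the ordering SuperLU chooses, and \( \omega = 2 \) is again an arbitrary choice.

import scipy.sparse.linalg as spla

omega = 2.0
A = symm2d(32, 32, omega).tocsc()
M = symm2d_decomp(32, 32, 8, 8, omega).tocsc()

luA = spla.splu(A)
luM = spla.splu(M)

# Fill-in: nonzeros in the combined L and U factors of each factorization.
print("nnz(A):", A.nnz, " nnz(L+U) for A:", luA.L.nnz + luA.U.nnz)
print("nnz(M):", M.nnz, " nnz(L+U) for M:", luM.L.nnz + luM.U.nnz)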

Results

Note that all results below sweep over values of the parameter \( -4 \leq \omega \leq 4 \), but I fix \( n _ x = n _ y = 32 \) for \( A(\omega) \) and \( b _ x = b _ y = 8 \) for \( M(\omega) \).

GMRES Iterations as Function of \( \omega \)

Below I use the built-in SciPy GMRES to solve the system with and without the preconditioner and record the number of iterations required for each \( \omega \).
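A minimal sketch of how such an iteration count can be obtained (my own reconstruction; the random right-hand side, the tolerance, and the helper name gmres_iters are assumptions, not values from the original experiments):

import numpy as np
import scipy.sparse.linalg as spla

def gmres_iters(A, M=None, tol=1e-8, seed=0):
    # Count inner GMRES iterations via the callback; b is an arbitrary random RHS.
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(A.shape[0])
    count = {"it": 0}
    def cb(resid_norm):
        count["it"] += 1
    # `tol` is the keyword in older SciPy; newer releases rename it to `rtol`.
    x, info = spla.gmres(A, b, tol=tol, restart=A.shape[0], M=M, callback=cb)
    return count["it"]

omega = 2.0
A = symm2d(32, 32, omega).tocsc()
lu = spla.splu(symm2d_decomp(32, 32, 8, 8, omega).tocsc())
Minv = spla.LinearOperator(A.shape, matvec=lu.solve)  # applies M^{-1}

print("unpreconditioned:", gmres_iters(A))
print("preconditioned:  ", gmres_iters(A, M=Minv))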

[Figure ex1: GMRES iterations as a function of \( \omega \)]

GMRES requires the most iterations when \( \omega \approx 0 \). As in my previous post, this does not necessarily correlate with conditioning.

GMRES Iterations as Function of Conditioning \( \kappa( A(\omega) ) \) and \( \kappa( M(\omega)^{-1} A(\omega) ) \)

I now calculate the condition number of the unpreconditioned \( A \) and the preconditioned \( M^{-1} A \) for each \( \omega \) to see whether it suggests any pattern between conditioning and the number of GMRES iterations needed to solve \( Ax = b \):
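For reference, a sketch of how these condition numbers can be computed; converting to dense and using numpy.linalg.cond is feasible at this modest size (\( 1024 \times 1024 \)) but not something to do for large problems, and \( \omega = 2 \) is again just an example value.

import numpy as np

omega = 2.0
Ad = symm2d(32, 32, omega).toarray()
Md = symm2d_decomp(32, 32, 8, 8, omega).toarray()

kappa_A  = np.linalg.cond(Ad)                       # kappa(A)
kappa_MA = np.linalg.cond(np.linalg.solve(Md, Ad))  # kappa(M^{-1} A)
print(kappa_A, kappa_MA)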

[Figure ex2: GMRES iterations as a function of \( \kappa( A(\omega) ) \) and \( \kappa( M(\omega)^{-1} A(\omega) ) \)]

Even in the preconditioned case, it does not really appear that reducing condition number necessarily results in fewer GMRES iterations.

GMRES Iterations as Function of Eigenvalue Signs

Since conditioning still does not predict GMRES performance in the preconditioned case, I look again at eigenvalue signs. I will show in an upcoming post that this metric also fails in important cases, but for this specific case it has explanatory power. The metric is calculated as follows:

$$ \begin{aligned} Q(A) = \frac{\min(\text{Number of positive eigenvalues of } A, \text{Number of negative eigenvalues of } A)}{\text{Total number of eigenvalues of } A} \end{aligned} $$
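A direct way to compute \( Q \) for \( A(\omega) \) at this size (a sketch; the function name is mine, and the dense symmetric eigensolver is fine here but would not scale to large problems):

import numpy as np

def eig_sign_fraction(A_dense):
    # Q(A): fraction of eigenvalues on the minority side of zero.
    # eigvalsh assumes a symmetric matrix, which holds for A(omega) here.
    eigs = np.linalg.eigvalsh(A_dense)
    n_pos = int(np.sum(eigs > 0))
    n_neg = int(np.sum(eigs < 0))
    return min(n_pos, n_neg) / eigs.size

print(eig_sign_fraction(symm2d(32, 32, 2.0).toarray()))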

[Figure ex5: GMRES iterations as a function of \( Q \)]

Conclusions

Previously I showed a realistic matrix family where conditioning does not predict GMRES performance. I did this because conditioning can lead to a very skewed and incorrect view of which preconditioning strategies will benefit specific problems. Here I followed up that counter-example by showing what happens with a realistic domain-decomposition preconditioner that actually increases the condition number of the matrix. It turns out that despite frequently worsening the condition number, the preconditioned system takes fewer GMRES iterations. In upcoming posts I will detail the subtle reasons this happens.