Characterizing GMRES Convergence Part 1: Not Just Conditioning

October 16, 2020

Here I attack a common myth about GMRES convergence: that when solving a linear system \( Ax = b \) with GMRES, the conditioning of \( A \) plays a large role in how many iterations GMRES takes to reach the user-prescribed tolerance (i.e. how expensive the solve will be). In fact, the conditioning of \( A \) may have no relationship to the cost of GMRES whatsoever. I give an explicit family of matrices whose condition numbers vary widely but for which the number of iterations GMRES requires to solve \( Ax = b \) is completely independent of those condition numbers. These matrices are symmetric, and GMRES performs well on them when their eigenvalues all have the same sign; but some matrices in this family have eigenvalues of mixed sign, and in that case GMRES performance drops precipitously even when the matrix is well conditioned.

Condition Number Counter-Example

The family of matrices is generated by three parameters \( n _ x, n _ y, \omega \), but for simplicity I will fix \( n _ x = n _ y = 32 \), as these do not materially impact the results. The matrices are generated by a difference formula as follows

$$ \begin{aligned} (Ax) _ {i,j} &= \omega x _ {i,j} + x _ {i+1,j} + x _ {i-1,j} + x _ {i,j-1} + x _ {i,j+1} \end{aligned} $$

with all indexing outside of the boundaries taken to evaluate to \( 0 \). I give an example of this matrix for \( n _ x = n _ y = 4 \) below:

$$ \begin{bmatrix} \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} \end{bmatrix} $$

and Python code for generating this matrix is given in symm2d below:

import scipy.sparse as sp

def symm1d(n):
    # 1D stencil: ones on the first sub- and super-diagonals, zero diagonal
    A = sp.lil_matrix((n, n))
    A[0, 1] = 1.0
    A[n - 1, n - 2] = 1.0
    for i in range(1, n - 1):
        A[i, i - 1] = 1.0
        A[i, i + 1] = 1.0
    return A

def symm2d(nx, ny, omega):
    # 2D operator: Kronecker sum of the 1D stencils, plus omega on the diagonal
    Ax = symm1d(nx)
    Ay = symm1d(ny)
    Ix = sp.eye(nx, nx)
    Iy = sp.eye(ny, ny)
    I = sp.eye(nx * ny, nx * ny)
    return sp.kron(Ax, Iy) + sp.kron(Ix, Ay) + omega * I
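
As a quick sanity check, the \( n _ x = n _ y = 4 \) matrix displayed above can be reproduced directly (the choice \( \omega = -2 \) here is arbitrary, purely for illustration):

A = symm2d(4, 4, -2.0)
print(A.toarray())  # prints the 16 x 16 matrix above with -2.0 on the diagonal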

Results

GMRES Iterations as a Function of \( \omega \)

Below I use the built-in SciPy GMRES and plot the number of iterations it needs to converge as a function of \( \omega \).
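
Here is a minimal sketch of that experiment. The right-hand side of all ones, the tolerance of \( 10^{-8} \), and the sweep over \( \omega \) are all assumptions for illustration; the qualitative picture does not depend on them.

import numpy as np
import scipy.sparse.linalg as spla

def gmres_iterations(A, b, tol=1e-8):
    # Count inner iterations via the callback; restart equals the problem
    # size so GMRES never actually restarts. (Newer SciPy renames tol to rtol.)
    count = [0]
    def callback(rk):
        count[0] += 1
    spla.gmres(A, b, tol=tol, restart=A.shape[0], maxiter=A.shape[0],
               callback=callback)
    return count[0]

nx = ny = 32
b = np.ones(nx * ny)                 # assumed right-hand side
omegas = np.linspace(-8.0, 8.0, 81)  # assumed sweep over omega
iters = [gmres_iterations(symm2d(nx, ny, w).tocsr(), b) for w in omegas]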

[Figure: GMRES iteration count as a function of \( \omega \)]

GMRES requires the most iterations when \( \omega \approx 0 \). Now I show that this is not because of the resulting conditioning, as conventional wisdom would suggest.

GMRES Iterations as a Function of Conditioning \( \kappa( A(\omega) ) \)

I now calculate the condition number of \( A \) for each \( \omega \) and check whether there is any relationship between the conditioning of \( A \) and the number of GMRES iterations needed to solve \( Ax = b \):
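
At this size (\( 32 \times 32 = 1024 \) unknowns) the 2-norm condition number can simply be computed from the densified matrix. A sketch, reusing the omegas sweep from the experiment above:

import numpy as np

def cond_2norm(A):
    # SVD-based 2-norm condition number; cheap enough at n = 1024.
    return np.linalg.cond(A.toarray())

conds = [cond_2norm(symm2d(32, 32, w)) for w in omegas]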

[Figure: GMRES iteration count as a function of \( \kappa( A(\omega) ) \)]

As I’ve already hinted, conditioning correlates only weakly with the final number of iterations GMRES needs to achieve the desired tolerance. The issue for this particular family of matrices is the distribution of the eigenvalues, which I discuss in more detail next.

GMRES Iterations as a Function of Eigenvalue Signs

I will show in part 2 of this series that the eigenvalues of a matrix also do not completely determine GMRES behavior, but for this family of matrices, all of which are symmetric, they do. One feature of the eigenvalues that can impact GMRES performance is their signs: for symmetric matrices, GMRES tends to perform better when all eigenvalues have the same sign. For the family of matrices constructed here that is not universally the case, as shown below:

[Figures: eigenvalue distributions of \( A(\omega) \) for \( \omega = -4 \) and \( \omega = -2 \)]
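
The sign pattern in these plots can also be read off analytically. \( A(\omega) \) is a Kronecker sum of two tridiagonal Toeplitz matrices, whose eigenvalues are known in closed form (a standard result, stated here only to explain the plots):

$$ \lambda _ {k,l}(\omega) = \omega + 2\cos\left(\frac{k\pi}{n _ x + 1}\right) + 2\cos\left(\frac{l\pi}{n _ y + 1}\right), \qquad 1 \le k \le n _ x, \quad 1 \le l \le n _ y $$

so every eigenvalue lies in the open interval \( (\omega - 4, \omega + 4) \).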

For \( \omega = -4 \) all eigenvalues are negative, but for \( \omega = -2 \) some eigenvalues are positive. To measure this mix of signs I define the following quantity

$$ \begin{aligned} Q(A) = \frac{\min(\text{Number of positive eigenvalues of } A,\ \text{Number of negative eigenvalues of } A)}{\text{Total number of eigenvalues of } A} \end{aligned} $$
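
Since every matrix in the family is symmetric, \( Q(A) \) is easy to compute with a dense symmetric eigensolver. A sketch:

import numpy as np

def Q(A):
    # A is symmetric, so eigvalsh returns all (real) eigenvalues.
    lam = np.linalg.eigvalsh(A.toarray())
    n_pos = int((lam > 0.0).sum())
    n_neg = int((lam < 0.0).sum())
    return min(n_pos, n_neg) / lam.size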

Using this feature of a matrix \( A \), we get a much better correlation with GMRES iterations:

[Figure: GMRES iteration count as a function of \( Q(A) \)]

Conclusions

Conditioning is frequently asserted to correlate strongly with the cost of GMRES, but here I have shown that conditioning may be completely decoupled from GMRES behavior. I also showed that the eigenvalue distribution can be significantly correlated with GMRES performance. Still, I caution anyone reading this against reading too far into this specific case, because eigenvalue distribution can also mislead. It turns out that the eigenvalue distribution completely determines GMRES behavior if the matrix is normal (the matrices here are symmetric, and all symmetric matrices are normal), but if a matrix is non-normal then the eigenvalue distribution may also fall short of predicting GMRES cost. I will show counter-examples of exactly this in part 2 of this series.