Characterizing GMRES Convergence Part 1: Not Just Conditioning

October 16, 2020

Here I attack a common myth about GMRES convergence: that when solving a linear system \( Ax = b \) with GMRES, the conditioning of \( A \) plays a large role in how many iterations GMRES takes to reach the user-prescribed tolerance (i.e. how expensive the solve will be). In fact, the conditioning of \( A \) may have no relationship to the cost of GMRES whatsoever. I give an explicit family of matrices whose condition numbers vary widely but for which the number of iterations GMRES requires to solve \( Ax = b \) is completely independent of those condition numbers. These matrices are symmetric, and GMRES performs well on them when their eigenvalues all have the same sign; but some matrices in this family have eigenvalues of mixed sign, and in that case GMRES performance drops precipitously even when the matrix is well conditioned.

Condition Number Counter-Example

The family of matrices is generated by three parameters \( n _ x, n _ y, \omega \), but for simplicity I will fix \( n _ x = n _ y = 32 \), as these do not materially impact the results. The matrices are generated by a difference formula as follows

$$ \begin{aligned} (Ax) _ {i,j} &= \omega x _ {i,j} + x _ {i+1,j} + x _ {i-1,j} + x _ {i,j-1} + x _ {i,j+1} \end{aligned} $$

with all indexing outside of the boundaries taken to evaluate to \( 0 \). I give an example of this matrix for \( n _ x = n _ y = 4 \) below:

$$ \begin{bmatrix} \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & 0.0 & 0.0 & 0.0 & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & 0.0 & \mathbf{\omega} & \mathbf{1.0} & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} & \mathbf{1.0} \\ 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & \mathbf{1.0} & 0.0 & 0.0 & \mathbf{1.0} & \mathbf{\omega} \end{bmatrix} $$

and Python code for generating this matrix is given in symm2d below:

import scipy.sparse as sp

def symm1d(n):
    # 1D stencil: ones on the first sub- and super-diagonals, zero diagonal
    A = sp.lil_matrix((n, n))
    A[0, 1] = 1.0
    A[n - 1, n - 2] = 1.0
    for i in range(1, n - 1):
        A[i, i - 1] = 1.0
        A[i, i + 1] = 1.0
    return A

def symm2d(nx, ny, omega):
    # 2D operator: Kronecker sum of the 1D stencils, plus omega on the diagonal
    Ax = symm1d(nx)
    Ay = symm1d(ny)
    Ix = sp.eye(nx, nx)
    Iy = sp.eye(ny, ny)
    I = sp.eye(nx * ny, nx * ny)
    return sp.kron(Ax, Iy) + sp.kron(Ix, Ay) + omega * I
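
As a quick sanity check, the \( n _ x = n _ y = 4 \) matrix displayed above can be reproduced directly (the choice \( \omega = -2 \) here is arbitrary, purely for illustration):

A = symm2d(4, 4, -2.0)
print(A.toarray())  # prints the 16 x 16 matrix above with -2.0 on the diagonal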

Results

GMRES Iterations as a Function of \( \omega \)

Below I use the built-in SciPy GMRES and plot the number of iterations it needs to converge as a function of \( \omega \).
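
Here is a minimal sketch of that experiment. The right-hand side of all ones, the tolerance of \( 10^{-8} \), and the sweep over \( \omega \) are all assumptions for illustration; the qualitative picture does not depend on them.

import numpy as np
import scipy.sparse.linalg as spla

def gmres_iterations(A, b, tol=1e-8):
    # Count inner iterations via the callback; restart equals the problem
    # size so GMRES never actually restarts. (Newer SciPy renames tol to rtol.)
    count = [0]
    def callback(rk):
        count[0] += 1
    spla.gmres(A, b, tol=tol, restart=A.shape[0], maxiter=A.shape[0],
               callback=callback)
    return count[0]

nx = ny = 32
b = np.ones(nx * ny)                 # assumed right-hand side
omegas = np.linspace(-8.0, 8.0, 81)  # assumed sweep over omega
iters = [gmres_iterations(symm2d(nx, ny, w).tocsr(), b) for w in omegas]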

[Figure: GMRES iteration count as a function of \( \omega \)]

GMRES requires the most iterations when \( \omega \approx 0 \). Now I show that this is not because of the resulting conditioning, as conventional wisdom would suggest.

GMRES Iterations as a Function of Conditioning \( \kappa( A(\omega) ) \)

I now calculate the condition number of \( A \) for each \( \omega \) and check whether there is any relationship between the conditioning of \( A \) and the number of GMRES iterations needed to solve \( Ax = b \):
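
At this size (\( 32 \times 32 = 1024 \) unknowns) the 2-norm condition number can simply be computed from the densified matrix. A sketch, reusing the omegas sweep from the experiment above:

import numpy as np

def cond_2norm(A):
    # SVD-based 2-norm condition number; cheap enough at n = 1024.
    return np.linalg.cond(A.toarray())

conds = [cond_2norm(symm2d(32, 32, w)) for w in omegas]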

[Figure: GMRES iteration count as a function of \( \kappa( A(\omega) ) \)]

As I’ve already hinted, conditioning correlates only weakly with the final number of iterations GMRES needs to achieve the desired tolerance. The issue for this particular family of matrices is the distribution of the eigenvalues, which I discuss in more detail next.

GMRES Iterations as a Function of Eigenvalue Signs

I will show in part 2 of this series that the eigenvalues of a matrix also do not completely determine GMRES behavior, but for this family of matrices, all of which are symmetric, they do. One feature of the eigenvalues that can impact GMRES performance is their signs: for symmetric matrices, GMRES tends to perform better when all eigenvalues have the same sign. For the family of matrices constructed here that is not universally the case, as shown below:

[Figures: eigenvalue distributions of \( A(\omega) \) for \( \omega = -4 \) and \( \omega = -2 \)]
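
The sign pattern in these plots can also be read off analytically. \( A(\omega) \) is a Kronecker sum of two tridiagonal Toeplitz matrices, whose eigenvalues are known in closed form (a standard result, stated here only to explain the plots):

$$ \lambda _ {k,l}(\omega) = \omega + 2\cos\left(\frac{k\pi}{n _ x + 1}\right) + 2\cos\left(\frac{l\pi}{n _ y + 1}\right), \qquad 1 \le k \le n _ x, \quad 1 \le l \le n _ y $$

so every eigenvalue lies in the open interval \( (\omega - 4, \omega + 4) \).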

For \( \omega = -4 \) all eigenvalues are negative, but for \( \omega = -2 \) some eigenvalues are positive. To measure this mix of signs I define the following quantity

$$ \begin{aligned} Q(A) = \frac{\min(\text{Number of positive eigenvalues of } A,\ \text{Number of negative eigenvalues of } A)}{\text{Total number of eigenvalues of } A} \end{aligned} $$
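
Since every matrix in the family is symmetric, \( Q(A) \) is easy to compute with a dense symmetric eigensolver. A sketch:

import numpy as np

def Q(A):
    # A is symmetric, so eigvalsh returns all (real) eigenvalues.
    lam = np.linalg.eigvalsh(A.toarray())
    n_pos = int((lam > 0.0).sum())
    n_neg = int((lam < 0.0).sum())
    return min(n_pos, n_neg) / lam.size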

Using this feature of a matrix \( A \), we get a much better correlation with GMRES iterations:

[Figure: GMRES iteration count as a function of \( Q(A) \)]

Conclusions

Conditioning is frequently asserted to correlate strongly with the cost of GMRES, but here I have shown that conditioning may be completely decoupled from GMRES behavior. I also showed that the eigenvalue distribution can be significantly correlated with GMRES performance. Still, I caution anyone reading this against reading too far into this specific case, because eigenvalue distribution can also mislead. It turns out that the eigenvalue distribution completely determines GMRES behavior if the matrix is normal (the matrices here are symmetric, and all symmetric matrices are normal), but if a matrix is non-normal then the eigenvalue distribution may also fall short of predicting GMRES cost. I will show counter-examples of exactly this in part 2 of this series.