Does QR Factorization Solve Linear Systems More Accurately Than Dense LU Factorization?

January 11, 2021

Here I investigate whether QR factorization can have superior accuracy to LU factorization with partial pivoting. QR factorization has many stability advantages over LU factorization, which suggests that a well-implemented QR factorization could produce a more accurate solution to a system of linear equations \( Ax=b \). I construct a numerical experiment based on a family of matrices that exhibits a wide range of troublesome numerical behavior, such as lack of symmetry, severe non-normality, and poor conditioning. For a random sample of matrices from this family I solve the same system of linear equations with QR factorization and with LU factorization with partial pivoting. I perform this experiment in C++ using LAPACK.

Rationale

QR factorization has favorable stability properties compared to LU factorization. The main reason is that it can be implemented with highly stable transformations such as Householder reflectors, whose stability is independent of the magnitude of any pivot value. LU factorization, on the other hand, requires some variant of Gaussian elimination, whose stability rests on the assumption that pivot values do not decay too rapidly. QR factorization is not typically used for solving systems of linear equations except when the underlying matrix is rank-deficient and least-squares solutions are desired. For the full-rank case LU factorization is cheaper, and the required pivoting does not usually slow the algorithm down because a well-implemented LU factorization has its chief bottleneck in level 3 BLAS, which is significantly more expensive than any pivoting could be. Still, it is interesting to me whether the stability properties of QR factorization could result in more accurate solutions.
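To make the contrast concrete, the standard backward error bounds look schematically as follows (the precise forms of the dimension-dependent constants \( c_{mn} \) and \( c_n \) and the choice of norm are suppressed; see Accuracy and Stability of Numerical Algorithms, listed in the references, for exact statements):

\[ \text{Householder QR:}\quad A + \Delta A = \hat{Q}\hat{R}, \qquad \|\Delta A\| \le c_{mn}\, u\, \|A\|, \]

\[ \text{LU with partial pivoting:}\quad A + \Delta A = \hat{L}\hat{U}, \qquad \|\Delta A\| \le c_n\, \rho_n\, u\, \|A\|, \]

where \( u \) is the unit roundoff and \( \rho_n \) is the growth factor, which partial pivoting only guarantees to be at most \( 2^{n-1} \). The QR bound contains no growth factor, which is the sense in which Householder QR is unconditionally backward stable.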

Experiment Setup

I construct a family of matrices and sample from it randomly. For each sample I solve the same system of linear equations with QR factorization and again with LU factorization, and then compute the relative error from the true solution. I also compute condition numbers of each matrix to give an impression of the relative difficulty of solving systems with them.
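Concretely, the accuracy measure reported below is the componentwise relative error with respect to the known true solution \( x \),

\[ \mathrm{err} = \max_i \frac{|\hat{x}_i - x_i|}{|x_i|}, \]

where \( \hat{x} \) is the computed solution from either the QR-based or the LU-based solve.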

Family of matrices

The family of matrices is generated with the following Python code:

import numpy as np

def make_matrix(nx,ny,coeffs):
    #Build an (nx*ny) x (nx*ny) matrix from a five-point stencil on an
    #nx-by-ny grid; coeffs[0] is the diagonal coefficient and coeffs[1:5]
    #couple each grid point to its four neighbors.
    m=nx*ny
    A=np.zeros((m,m))
    def id(ix,iy):
        #Map a 2D grid index to a linear row/column index
        return ix + nx*iy

    for ix in range(0,nx):
        for iy in range(0,ny):
            A[id(ix,iy),id(ix,iy)]=coeffs[0]
            if ix>0:
                A[id(ix-1,iy),id(ix,iy)]=coeffs[1]
            if ix<nx-1:
                A[id(ix+1,iy),id(ix,iy)]=coeffs[2]
            if iy>0:
                A[id(ix,iy-1),id(ix,iy)]=coeffs[3]
            if iy<ny-1:
                A[id(ix,iy+1),id(ix,iy)]=coeffs[4]
    return A

The family of matrices produced by make_matrix is thus fully determined by the parameters nx, ny, and coeffs.
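As a small worked example (a sanity check rather than part of the experiment), taking nx = ny = 2 with the symmetric coefficients coeffs = (4, -1, -1, -1, -1) reproduces the familiar five-point Laplacian on a \( 2 \times 2 \) grid:

\[ A = \begin{pmatrix} 4 & -1 & -1 & 0 \\ -1 & 4 & 0 & -1 \\ -1 & 0 & 4 & -1 \\ 0 & -1 & -1 & 4 \end{pmatrix}. \]

In the experiment itself the five coefficients are drawn independently at random, so the sampled matrices are generally neither symmetric nor diagonally dominant.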

Pseudocode of experiment

Ordinarily I would prefer to do everything in SciPy, but SciPy does not seem to have a way to multiply with the matrix \( Q^T \) resulting from a QR factorization except by forming the matrix \( Q \) explicitly. Since I am specifically interested in the errors resulting from a quality implementation of a QR-based solve, I think it is appropriate that \( Q \) be applied not by forming it explicitly but directly from its representation as a product of Householder reflectors, which incurs less roundoff because it requires less arithmetic. As a result the experiment is done in C++ so as to have easier access to LAPACK, but for simplicity I outline the approach in Python-like pseudocode:

import numpy as np

nsamples=10000
nx=32
ny=32
m=nx*ny
soln=np.random.rand(m)
for i in range(0,nsamples):
    #Generate random matrix with uniform coefficients between -4 and 4
    coeffs=np.random.uniform(-4.0,4.0,5)
    A=make_matrix(nx,ny,coeffs)
    #Manufacture RHS which results in desired solution
    b=A@soln
    #Solve with QR factorization
    xqr=qrsolve(A,b)
    #Solve with LU factorization
    xlu=lusolve(A,b)
    #get relative error for QR solve
    qr_err = np.amax( np.abs(xqr - soln)/np.abs(soln) )
    #get relative error for LU solve
    lu_err = np.amax( np.abs(xlu - soln)/np.abs(soln) )
    #Compute other quantities of interest below (conditioning, loss of symmetry, etc)
    #Omitted

Note on LAPACK routines used

As mentioned above, I implemented this with raw LAPACK. LAPACK has many routines for computing orthogonal factorizations and performing related calculations. I list the routines I used below so the interested reader can easily find them by internet search:

  1. xGESV - solve a system of linear equations using LU factorization with partial pivoting
  2. xTRTRS - solve a triangular system
  3. xGEQRF - compute a QR factorization using Householder reflectors
  4. xORMQR - apply \( Q \), represented as a product of Householder reflectors, to a matrix or vector
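To make the call sequence concrete, here is a minimal sketch of how these routines chain together into the qrsolve and lusolve used in the pseudocode above. It is written against the LAPACKE C interface with column-major storage; the function signatures shown are standard LAPACKE, but the surrounding structure, omission of error handling, and pass-by-value copies are my own simplifications, and the code in the repository differs in its details.

#include <vector>
#include <lapacke.h>

// Solve A x = b via QR: factor A = Q R, form Q^T b, then back-substitute.
// A is a dense square n-by-n matrix stored column-major; b has length n.
std::vector<double> qrsolve(std::vector<double> A, std::vector<double> b, lapack_int n)
{
    std::vector<double> tau(n);
    // xGEQRF: A is overwritten with R (upper triangle) and the
    // Householder reflectors that implicitly represent Q.
    LAPACKE_dgeqrf(LAPACK_COL_MAJOR, n, n, A.data(), n, tau.data());
    // xORMQR: overwrite b with Q^T b by applying the reflectors directly,
    // without ever forming Q explicitly.
    LAPACKE_dormqr(LAPACK_COL_MAJOR, 'L', 'T', n, 1, n,
                   A.data(), n, tau.data(), b.data(), n);
    // xTRTRS: solve the upper-triangular system R x = Q^T b.
    LAPACKE_dtrtrs(LAPACK_COL_MAJOR, 'U', 'N', 'N', n, 1,
                   A.data(), n, b.data(), n);
    return b; // b now holds the solution x
}

// Solve A x = b via LU with partial pivoting in a single call.
std::vector<double> lusolve(std::vector<double> A, std::vector<double> b, lapack_int n)
{
    std::vector<lapack_int> ipiv(n);
    // xGESV: factor P A = L U and solve.
    LAPACKE_dgesv(LAPACK_COL_MAJOR, n, 1, A.data(), n, ipiv.data(), b.data(), n);
    return b; // b now holds the solution x
}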

Code repository

I have made the code available at this repository. Note: I have not fully documented everything there.

Results

To generate these results I sampled \(10000\) matrices and, for each one, computed its conditioning, loss of symmetry, and non-normality, and finally the errors of a QR solve and an LU solve of the same system of linear equations.

The purpose of the other computations is to establish that this family of matrices encompasses a wide range of potentially problematic numerical behavior, for which we might expect a more stable calculation like QR factorization to outperform a less stable one like LU factorization.

Conditioning of matrices

For this calculation I computed the condition number \( \kappa(A) = \|A\|\,\|A^{-1}\| \) of each sampled matrix and plotted the logarithms as a histogram.

[Figure: histogram of the logarithm of the condition numbers]

Loss of symmetry

For this calculation I computed the loss of symmetry \( \frac{\| A^T - A \|}{\| A + A^T \|} \) and plotted it as a histogram.

[Figure: histogram of the loss-of-symmetry measure]

Non-normality

For this calculation I computed the non-normality measure \( \frac{\| AA^T - A^TA \|}{\| AA^T + A^TA \|} \) and plotted it as a histogram.

[Figure: histogram of the non-normality measure]

Relative errors of QR-solves versus LU-solves

Finally I generated a random solution \( x \) and manufactured a right-hand side \( b \) such that \( Ax = b \), and then tried to reconstruct \( x \) using first QR factorization and then LU factorization. I plot their errors below on the same scale.

[Figure: relative errors of the QR-based and LU-based solves on the same scale]

The same errors are plotted again below, expressed instead as a ratio.

[Figure: the same errors expressed as a ratio]

Conclusion

Despite its favorable stability properties, QR factorization did not outperform LU factorization in accuracy when solving the same linear systems on this family of matrices. Perhaps there are other families of matrices for which the improved stability of QR factorization could materially improve on the accuracy of LU factorization, but it did not for this family, despite the family exhibiting a fairly wide range of potentially problematic numerical properties.

Practical implications?

This was just a curiosity of mine, and even if QR factorization had turned out to be more accurate, that still would not be a very compelling reason to use it instead of LU factorization for solving a system of linear equations. The reason is that one can apply iterative refinement at negligible extra cost and recover any achievable digits of accuracy (how many digits one can reasonably expect to recover this way depends on how well conditioned the matrix is).
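For completeness, one step of iterative refinement (sketched here in exact-arithmetic notation; in practice the residual is ideally computed in higher precision) reuses the factorization already computed for \( A \):

\[ r_k = b - A x_k, \qquad A\, d_k = r_k \ \text{(solved with the existing factors)}, \qquad x_{k+1} = x_k + d_k, \]

so each step costs only a matrix-vector product and a pair of triangular solves, which is negligible next to the \( O(n^3) \) cost of the factorization itself.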

References

The following books provide thorough coverage of QR factorization, LU factorization, their mathematical properties, and proper implementation. I referenced them liberally while writing this blog post and the accompanying code.

  1. Accuracy and Stability of Numerical Algorithms
  2. Matrix Computations
  3. LAPACK Users' Guide