# Solving Systems of Linear Equations

In this example we will solve a system of linear equations $$Ax=b$$. For this purpose we first compute the $$LU$$ factorization of $$A$$ with lapack::trf and then use the triangular solver blas::sm to finish the job. In a later example we show how to call the LAPACK driver function lapack::sv which does internnally what you see here.

## Example Code

#include <iostream>
#include <flens/flens.cxx>

using namespace std;
using namespace flens;

typedef double   T;

int
main()
{
typedef GeMatrix<FullStorage<T> >           Matrix;
typedef DenseVector<Array<T> >              Vector;

typedef Matrix::IndexType                   IndexType;
typedef DenseVector<Array<IndexType> >      IndexVector;

const Underscore<IndexType> _;

const IndexType m = 4,
n = 5;

Matrix            Ab(m, n);
IndexVector       piv(m);

Ab =  2,   3,  -1,   0,  20,
-6,  -5,   0,   2, -33,
2,  -5,   6,  -6, -43,
4,   6,   2,  -3,  49;

cout << "Ab = " << Ab << endl;

lapack::trf(Ab, piv);

const auto A = Ab(_,_(1,m));
auto       B = Ab(_,_(m+1,n));

blas::sm(Left, NoTrans, T(1), A.upper(), B);

cout << "X = " << B << endl;
}

With header flens.cxx all of FLENS gets included.

#include <flens/flens.cxx>

Define some convenient typedefs for the matrix/vector types of our system of linear equations.

typedef GeMatrix<FullStorage<T> >           Matrix;
typedef DenseVector<Array<T> >              Vector;

We also need an extra vector type for the pivots. The type of the pivots is taken for the system matrix.

typedef Matrix::IndexType                   IndexType;
typedef DenseVector<Array<IndexType> >      IndexVector;

Define an underscore operator for convenient matrix slicing

const Underscore<IndexType> _;

Set up the baby problem ...

const IndexType m = 4,
n = 5;

Compute the $$LU$$ factorization with lapack::trf

lapack::trf(Ab, piv);

Solve the system of linear equation $$Ax =B$$ using blas::sm

const auto A = Ab(_,_(1,m));
auto       B = Ab(_,_(m+1,n));

## Compile

$shell> cd flens/examples$shell> g++ -std=c++11 -Wall -I../.. -o lapack-getrf lapack-getrf.cc

## Run

$shell> cd flens/examples$shell> ./lapack-getrf
Ab =
2             3            -1             0            20
-6            -5             0             2           -33
2            -5             6            -6           -43
4             6             2            -3            49
X =
1
9
9
9

## Using QD (Double-Double and Quad-Double Package)

The following description is taken from the QD Library page:

The QD Library supports both a double-double datatype (approx. 32 decimal digits) and a quad-double datatype (approx. 64 decimal digits). The computational library is written in C++. Both C++ and Fortran-90 high-level language interfaces are provided to permit one to use convert an existing C++ or Fortran-90 program to use the library with only minor changes to the source code. In most cases only a few type statements and (for Fortran-90 programs) read/write statements need to be changed. PSLQ and numerical quadrature programs are included.

### Modified Example Code

#include <iostream>

#include <qd/qd_real.h>
#include <qd/fpu.h>

#include <flens/flens.cxx>

using namespace std;
using namespace flens;

typedef qd_real   T;

int
main()
{
unsigned int old_cw;
fpu_fix_start(&old_cw);

typedef GeMatrix<FullStorage<T> >           Matrix;
typedef DenseVector<Array<T> >              Vector;
typedef Matrix::IndexType                   IndexType;
typedef DenseVector<Array<IndexType> >      IndexVector;

const Underscore<IndexType> _;

const IndexType m = 4,
n = 5;

Matrix            Ab(m, n);
IndexVector       piv(m);

Ab =  2,   3,  -1,   0,  20,
-6,  -5,   0,   2, -33,
2,  -5,   6,  -6, -43,
4,   6,   2,  -3,  49;

cout << "Ab = " << Ab << endl;

lapack::trf(Ab, piv);

const auto A = Ab(_,_(1,m));
auto       B = Ab(_,_(m+1,n));

blas::sm(Left, NoTrans, T(1), A.upper(), B);

cout << "X = " << B << endl;

fpu_fix_end(&old_cw);
}

Only a few modifications were made which will be pointed out in the following:

#include <qd/qd_real.h>
#include <qd/fpu.h>

For using quad-double precision in this example change the typedef to

typedef qd_real   T;

For understanding the next code lines we first take a look at the QD Library documentation:

The algorithms in the QD library assume IEEE double precision floating point arithmetic. Since Intel x86 processors have extended (80-bit) floating point registers, the round-to-double flag must be enabled in the control word of the FPU for this library to function properly under x86 processors. The function fpu_fix_start turns on the round-to-double bit in the FPU control word, while fpu_fix_end will restore the original state.

So the first thing we do in main is turning on the correct rounding ...

int
main()
{
unsigned int old_cw;
fpu_fix_start(&old_cw);

... and at the end restore FPU rounding behavior as mentioned above.

fpu_fix_end(&old_cw);
}

### Compile and Run

$shell> cd flens/examples$shell> g++ -Wall -std=c++11 -I../.. -I/opt/local/include/ -c qd-lapack-getrf.cc
$shell> g++ -o qd-lapack-getrf qd-lapack-getrf.o /opt/local/lib/libqd.a$shell> ./qd-lapack-getrf
Ab =
2.000000e+00  3.000000e+00 -1.000000e+00  0.000000e+00  2.000000e+01
-6.000000e+00 -5.000000e+00  0.000000e+00  2.000000e+00 -3.300000e+01
2.000000e+00 -5.000000e+00  6.000000e+00 -6.000000e+00 -4.300000e+01
4.000000e+00  6.000000e+00  2.000000e+00 -3.000000e+00  4.900000e+01
X =
1.000000e+00
9.000000e+00
9.000000e+00
9.000000e+00

## Using mpfr::real (C++ Interface for mpfr)

The following description is taken from the mpfr::real page:

The class mpfr::real is a high-level C++ wrapper for the GNU MPFR library, a C library for multiple-precision floating-point computations with correct rounding.

The objects of mpfr::real can (almost) be used like doubles or other fundamental floating-point types, thus avoiding the use of C functions to manipulate MPFR's low-level data type directly. In addition to all arithmetic and comparison operators available for fundamental floating-point types, mpfr::real also supports the set of mathematical functions for double from math.h/cmath. Finally, std::istream operator >> and std::ostream operator << are implemented for mpfr::reals. This allows to substitute double with mpfr::real with no further modifications of the code in many cases.

### Modified Example Code

#include <iostream>

#define REAL_ENABLE_CONVERSION_OPERATORS
#include <external/real.hpp>
#include <flens/flens.cxx>

using namespace std;
using namespace flens;

typedef mpfr::real<53>   T;

int
main()
{
typedef GeMatrix<FullStorage<T> >           Matrix;
typedef DenseVector<Array<T> >              Vector;
typedef Matrix::IndexType                   IndexType;
typedef DenseVector<Array<IndexType> >      IndexVector;

const Underscore<IndexType> _;

const IndexType m = 4,
n = 5;

Matrix            Ab(m, n);
IndexVector       piv(m);

Ab =  2,   3,  -1,   0,  20,
-6,  -5,   0,   2, -33,
2,  -5,   6,  -6, -43,
4,   6,   2,  -3,  49;

cout << "Ab = " << Ab << endl;

lapack::trf(Ab, piv);

const auto A = Ab(_,_(1,m));
auto       B = Ab(_,_(m+1,n));

blas::sm(Left, NoTrans, T(1), A.upper(), B);

cout << "X = " << B << endl;
}

Include header file for mpfr::real before flens.cxx and make sure that conversion operators are enabled

#define REAL_ENABLE_CONVERSION_OPERATORS
#include <external/real.hpp>
#include <flens/flens.cxx>

Make typedef for using mpfr::real

typedef mpfr::real<53>   T;

### Compile and Run

$shell> cd flens/examples$shell> g++ -Wall -std=c++11 -I../.. -I/opt/local/include -o mpfr-real-lapack-getrf mpfr-real-lapack-getrf.cc -L /opt/local/lib -lmpfr -lgmp
$shell> ./mpfr-real-lapack-getrf Ab = 2.000000e+00 3.000000e+00 -1.000000e+00 0.000000e+00 2.000000e+01 -6.000000e+00 -5.000000e+00 0.000000e+00 2.000000e+00 -3.300000e+01 2.000000e+00 -5.000000e+00 6.000000e+00 -6.000000e+00 -4.300000e+01 4.000000e+00 6.000000e+00 2.000000e+00 -3.000000e+00 4.900000e+01 X = 1.000000e+00 9.000000e+00 9.000000e+00 9.000000e+00 ## For Performance: Link with ATLAS For high performance you have to link with an optimized BLAS implementation like GotoBLAS or ATLAS. For ATLAS pass -DWITH_ATLAS to the compiler:$shell> cd flens/examples
$shell> g++ -DWITH_ATLAS -Wall -std=c++11 -I../.. -I/opt/local/include -o atlas-lapack-getrf lapack-getrf.cc -latlas -lcblas$shell> ./atlas-lapack-getrf
Ab =
2             3            -1             0            20
-6            -5             0             2           -33
2            -5             6            -6           -43
4             6             2            -3            49
X =
1
9
9
9

If you want to check which BLAS function gets called also pass -DDEBUG_CXXBLAS to the compiler and grep and sort for CXXBLAS (note that the problem size in this example is so small that no LEVEL 3 BLAS function gets used):

$shell> cd flens/examples$shell> g++ -DWITH_ATLAS -DCXXBLAS_DEBUG -Wall -std=c++11 -I../.. -I/opt/local/include -o atlas-lapack-getrf lapack-getrf.cc -latlas -lcblas
\$shell> ./atlas-lapack-getrf 2>&1 | grep CXXBLAS | sort -u
CXXBLAS: [ATLAS] cblas_dger
CXXBLAS: [ATLAS] cblas_dscal
CXXBLAS: [ATLAS] cblas_dswap
CXXBLAS: [ATLAS] cblas_dtrsm
CXXBLAS: [ATLAS] cblas_idamax

For optimal performance also pass -DNDEBUG and -O3.