## Numerical Algorithms for High-Performance Computational Science, Royal Society, London, April 8-9, 2019

This 2-day scientific discussion meeting is being held at the Royal Society, London, UK, and is organized by Jack Dongarra, Laura Grigori and Nick Higham.

The programme consists of invited talks and contributed posters. The invited speakers are

1. Guillaume Aupy, INRIA Bordeaux
2. Erin Carson, Charles University, Prague
3. George Constantinides, Imperial College
4. Steve Furber, FRS, University of Manchester
5. Mike Heroux, Sandia National Laboratories
6. Tony Hey, Science and Technology Facilities Council
7. David Keyes, King Abdullah University of Science and Technology, Saudi Arabia
8. Doug Kothe, Director of the DOE Exascale Computing Project (ECP)
9. Satoshi Matsuoka, RIKEN Center for Computational Science, Tokyo
10. Tim Palmer, FRS, Oxford University
11. Jack Poulson, Hodge Star Scientific Computing, Toronto
12. Anna Scaife, University of Manchester
13. John Shalf, Lawrence Berkeley National Laboratory
14. Rick Stevens, Argonne National Laboratory
15. Michela Taufer, University of Delaware
16. Kathy Yelick, Lawrence Berkeley National Laboratory,

The deadline for submission of poster abstracts is Friday March 1, 2019 (extended from Monday 4 February 2019).

Attendance is free, but places are limited and advance registration is essential.

For more information, including the programme and registration, see https://royalsociety.org/science-events-and-lectures/2019/04/high-performance-computing/

## A new preconditioner exploiting low-rank factorization error

The solution of a linear system $Ax = b$ is a fundamental task in scientific computing. Two main classes of methods to solve such a system exist.

• Direct methods compute a factorization of matrix $A$, such as LU factorization, to then directly obtain the solution $x=U^{-1}L^{-1}b$ by triangular substitution; they are very reliable but also possess a high computational cost, which limits the size of problems that can be tackled.
• Iterative methods compute a sequence of iterates $x_k$ converging towards the solution $x$; they are inexpensive but their convergence and thus reliability strongly depends on the matrix properties, which limits the scope of problems that can be tackled.

A current major challenge in the field of numerical linear algebra is therefore to develop methods that are able to tackle a large scope of problems of large size.

To accelerate the convergence of iterative methods, one usually uses a preconditioner, that is, a matrix $M$ ideally satisfying three conditions: (1) $M$ is cheap to compute; (2) $M$ can be easily inverted; (3) $M^{-1}$ is a good approximation to $A^{-1}$. With such a matrix $M$, the preconditioned system $M^{-1}Ax=M^{1}b$ is then cheap to solve with an iterative method and often requires a small number of iterations only. An example of a widely used class of preconditioners is when $M$ is computed as a low-accuracy LU factorization.

Unfortunately, for many important problems it is quite difficult to find a preconditioner that is both of good quality and cheap to compute, especially when the matrix $A$ is ill conditioned, that is, when the ratio between its largest and smallest singular values is large.

In our paper A New Preconditioner that Exploits Low-rank Approximations to Factorization Error, with Nick Higham, which recently appeared in SIAM Journal of Scientific Computing, we propose a novel class of general preconditioners that builds on an existing, low-accuracy preconditioner $M=A-\Delta A$.

This class of preconditioners is based on the following key observation: ill-conditioned matrices that arise in practice often have a small number of small singular values. The inverse of such a matrix has a small number of large singular values and so is numerically low rank. This observation suggests that the error matrix $E = M^{-1}A - I = M^{-1}\Delta A \approx A^{-1}\Delta A$ is of interest, because we may expect $E$ to retain the numerically low-rank property of $A^{-1}$.

In the paper, we first investigate theoretically and experimentally whether $E$ is indeed numerically low rank; we then describe how to exploit this property to accelerate the convergence of iterative methods by building an improved preconditioner $M(I+\widetilde{E})$, where $\widetilde{E}$ is a low-rank approximation to $E$. This new preconditioner is equal to $A-M(E-\widetilde{E})$ and is thus almost a perfect preconditioner if $\widetilde{E}\approx E$. Moreover, since $\widetilde{E}$ is a low-rank matrix, $(I+\widetilde{E})^{-1}$ can be cheaply computed via the Sherman–Morrison–Woodbury formula, and so the new preconditioner can be easily inverted.

We apply this new preconditioner to three different types of approximate LU factorizations: half-precision LU factorization, incomplete LU factorization (ILU), and block low-rank (BLR) LU factorization. In our experiments with GMRES-based iterative refinement, we show that the new preconditioner can achieve a significant reduction in the number of iterations required to solve a variety of real-life $Ax=b$ problems.

## Research Associate in Numerical Linear Algebra Group – 2 posts

The Numerical Linear Algebra Group at the University of Manchester is seeking two Research Associates to work with Professor Nick Higham on developing and analyzing numerical linear algebra algorithms for current and future high-performance computers.

Topics for investigation include linear equations, linear least squares problems, eigenvalue problems, the singular value decomposition, correlation matrix problems, and matrix function evaluation. This work will exploit multiprecision arithmetic (particularly the fast half precisions available on some recent and forthcoming processors) and techniques such as acceleration and randomization. It will involve using rounding error analysis, statistical analysis, and numerical experiments to obtain new understanding of algorithm accuracy and efficiency.

One of the posts is associated with Professor Higham’s Royal Society Research Professorship. The other post is associated with the EPSRC project Inference, Computation and Numerics for Insights into Cities (ICONIC), which involves Imperial College (Professor Mark Girolami), the University of Manchester (Professor Nick Higham), the University of Oxford (Professor Mike Giles), and the University of Strathclyde (Professor Des Higham).

The closing date is February 11, 2019. For the advert and more details see here.

## Beyer Chair in Applied Mathematics

The School of Mathematics is seeking to appoint an outstanding mathematical scientist to the Beyer Chair in Applied Mathematics.

The Beyer Chair is a senior professorial position in Applied Mathematics, established in 1881 by an endowment from the industrialist Charles Frederick Beyer.

Applicants will have a distinguished track record of research in one or more areas of Applied Mathematics, defined in its broadest sense, and will play a leading role in the life of the School, by providing inspiring leadership in research, and also through appropriate teaching and service activities.

The closing date is March 22, 2019. For advert and more details see here.

## NLA Group Talks at SIAM Conference on Computational Science and Engineering 2019

Members of the Numerical Linear Algebra Group will be giving nine presentations at the upcoming SIAM Computational Science and Engineering (CSE) conference. They are also organizing the two-part minisymposium Advances in Analyzing Floating-point Errors in Computational Science.

The conference will be held at the Spokane Convention Center, Washington, USA, from 25th February to 1st March, 2019.

Here are the dates and times where members of our group will be giving their talks:

Tuesday 26 February
9:45 – 10:05 Sven Hammarling
Standardization of the Batched Blas

Wednesday 27 February
9:45 – 10:05 Theo Mary
A New Approach to Probabilistic Roundoff Error Analysis
14:15 – 14:35 Pierre Blanchard
Algorithm Based Error Analysis for Mixed Precision Matrix Factorizations

Thursday 28 February
8:45 – 9:15 Jack Dongarra
SIAM/ACM Prize in Computational Science and Engineering: The Singular Value Decomposition: Anatomy of an Algorithm, Optimizing for Performance
9:45 – 10:05 Françoise Tisseur
NLEVP: A Collection of Nonlinear Eigenvalue Problems
10:35-10:55 Steven Elsworth
The Rational Krylov Toolbox
14:15 – 14:35 Nick Higham
Exploiting Half Precision Arithmetic in Solving Ax=b
15:05 – 15:25 Jack Dongarra
Experiments with Mixed Precision Algorithms in Linear Algebra
16:10 – 16:30 Mawussi Zounon
Distributed Tasking in the PLASMA Numerical Library

Friday 1 March
9:45 – 11:25 Srikara Pranesh
Domain Decomposition Method for High Dimensional Stochastic Systems

More information on CSE19 is available here.

## Numerical Linear Algebra Group Activities 2018

The Numerical Linear Algebra Group, many of whom are in the October 2018 photo below, had a busy year. This post summarizes what we got up to. Publications are not included here, but many of them can be found on MIMS EPrints under the category Numerical Analysis.

As well as the new group members mentioned below, Stephanie Lai joined the group in September as Executive Assistant to Nick Higham. Stephanie, who holds an MMath degree from the University of Manchester, helps to maintain this website, which launched this autumn. Martin Lotz left the group in August 2018 and is now an Associate Professor at The University of Warwick.

## Software

Together with Jack Dongarra’s team at the University of Tennessee the group is one of the two partners involved in the development of PLASMA: Parallel Linear Algebra Software for Multicore Architectures.

PhD students Weijian Zhang, Steven Elsworth and Jonathan Deakin continued to develop Etymo—a technology for developing knowledge graphs, including Etymo Scholar, a discovery engine for AI research.

We make our research codes available as open source, principally on GitHub; see the repositories of Connolly, Fasi, Higham, Liu, PraneshRelton, Sego, Tisseur, ZhangZounon.

We also put MATLAB software on MATLAB Central File Exchange and on our own web sites, e.g., the Rational Krylov Toolbox (RKToolbox).

## PhD Students

We welcomed new PhD students Michael Connolly, Xiaobo Liu, and Conor Rogers.

Weijian Zhang successfully defended his PhD on Evolving Graphs and similarity-based Graphs with Applications in October 2018.

## Postdoctoral Research Associates (PDRAs)

Theo Mary joined us in January 2018 to work jointly on the ICONIC project.

Mawussi Zounon, who was working on the Parallel Numerical Linear Algebra for Extreme Scale Systems (NLAFET) project, moved over in November 2018 to the position of KTP Associate in a three-year Knowledge Transfer Partnership with NAG.

Jakub Sistek returned to the Czech Academy of Sciences after the completion of the project Programming Model INTERoperability ToWards Exascale (INTERTWinE).

Prashanth Nadukandi completed his Marie Skłodowska-Curie Individual Fellowship in September 2018 and is now a Technical Advisor in Advanced Mathematics at the Repsol Technology Lab, Móstoles, Spain.

Maksims Abalenkovs completed his work on the EPSRC project SERT: Scale-free, Energy-aware, Resilient and Transparent Adaptation of CSE Applications to Mega-core Systems.

## Presentations at Conferences and Workshops

UK and Republic of Ireland Section of SIAM Annual Meeting, National Oceanography Centre, Southampton, January 11, 2018: Blanchard, Nadukandi.

Waseda Research Institute for Science and Engineering Institute for Mathematical Science Kickoff Meeting, Waseda University, Tokyo, March 6, 2018: Higham.

SIAM Conference on Parallel Processing for Scientific Computing, Waseda University, Tokyo, Japan, March 7-10, 2018: Hammarling, Higham, Mary, Zounon.

Conference on Scientific Computing and Approximation, Purdue University, March 30-31, 2018: Higham.

SIAM Conference on Applied Linear Algebra, May 4-8, 2018, Hong Kong Baptist University: Elsworth, Güttel, Zemaityte.

Structured Matrix Days 2018, May 14-15, 2018, ENS Lyon, France: Mary, Tisseur.

Mathematical Modelling, Numerical Analysis and Scientific Computing, Kácov, Czech Republic, May 27-June 1, 2018: Higham, Tisseur.

International Conference on Approximation and Matrix Functions, May 31 and June 1, 2018, Université de Lille, France: Güttel.

Bohemian Matrices and Applications, University of Manchester, June 20-22, 2018: Higham.

International Supercomputing conference (ISC18), Frankfurt, June 24-28, 2018: Hammarling, Zounon.

10th International Workshop on Parallel Matrix Algorithms and Applications (PMAA18), June 27-29, 2018, ETH Zurich: Zounon.

6th IMA Conference on Numerical Linear Algebra and Optimization, June 27-29 2018, University of Birmingham: Blanchard, Mary, Tisseur.

SIAM Annual Meeting, Portland, USA, July 9-13, 2018: Blanchard, Higham, Tisseur.

JuliaCon 2018, London, August 7-11, 2018: Higham.

Sparse Day Meeting 2018, September 26-27 2018, Toulouse, France: Mary.

## Recognition and Service

• Nick Higham gave the Samuel D. Conte Distinguished Lecture at the Department of Computer Science, Purdue University, USA, in March 2018.
• Massimilano Fasi was presented with the SIAM Student Chapter Certificate of Recognition 2018.

## The Paterson–Stockmeyer method for evaluating polynomials and rational functions of matrices

According to Moler and Van Loan, truncating the Taylor series expansion to the exponential at 0 is the least effective of their nineteen dubious ways to compute the exponential of a matrix. Such an undoubtedly questionable approach can, however, become a powerful tool for evaluating matrix functions when used in conjunction with other strategies, such as, for example, the scaling and squaring technique. In fact, truncated Taylor series are just a special case of a much larger class of rational approximants, known as the Padé family, on which many state-of-the-art algorithms rely in order to compute matrix functions.

A customary choice for evaluating these approximants at a matrix argument is the Paterson–Stockmeyer method, an evaluation scheme that was originally proposed as an asymptotically optimal algorithm for evaluating polynomials of matrices, but generalizes quite naturally to rational functions, which are nothing but the solutions of a linear system whose coefficients and right-hand side are polynomials of the same matrix. This technique exploits a clever rewriting of a polynomial in $A$ as a polynomials in $A^s$, for some positive integer $s$, and overall requires about $2\sqrt{k}$ matrix products to evaluate a polynomial of degree $k$.

As shown in the figure, when the Paterson–Stockmeyer scheme is used the number of matrix multiplications required to evaluate a polynomial of degree $k$ grows slower than $k$ itself, with the result that evaluating polynomials of different degree will asymptotically have the same cost. For example, evaluating a polynomial of any degree between 43 and 49 requires 12 matrix multiplications, thus there is little point in considering an approximant of degree 43 when evaluating that of degree 49 has roughly the same cost but will in all likelihood deliver a more accurate result.

When designing algorithms to evaluate functions of matrices, one is interested in finding the optimal degrees, those marked with a red circle in the figure above, since they guarantee maximal accuracy for a given computational cost. When fixed precision is considered, finding all such degrees is not a problem: a backward error analysis can be used to determine the maximum degree that will ever be needed, $m_{max}$ say, and then looking at the plot is enough to find all the optimal degrees smaller than $m_{max}$. In order to deal with arbitrary precision arithmetic, however, a different strategy is needed, as depending on the working precision and the required tolerance, approximants of arbitrarily high degree may be needed. The new Eprint Optimality of the Paterson–Stockmeyer Method for Evaluating Matrix Polynomials and Rational Matrix Functions studies the cost of the Paterson–Stockmeyer method for polynomial evaluation and shows that a degree is optimal if and only if it is a quarter-square, that is, a number of the form $\lfloor n^2/4 \rfloor$ for some nonnegative integer $n$, where $\lfloor \cdot \rfloor$ is the floor function.

Similar results can be obtained for Paterson–Stockmeyer-like algorithms for evaluating diagonal Padé approximants, rational functions whose numerator and denominator have same degree. In that case, one can show that an approximant is optimal if and only if the degree of numerator and denominator is an eight-square, an integer of the form $\lfloor n^2/8 \rfloor$ for some $n \in \mathbb{N}$. In particular, for diagonal Padé approximants to the exponential, due to a symmetry of the coefficients of numerator and denominator, faster algorithms can be developed, and an explicit formula—not as nice as that in the two previous cases—can be derived for the optimal orders of these approximants.