SIAM PP22 Minisymposium on Understanding and Exploiting Mixed-Precision Accelerators for High-Performance Computing
This double minisymposium, organized by Mantas Mikaitis and Nick Higham, took place during the SIAM Conference on Parallel Processing for Scientific Computing (PP22), held virtually on 23-26 February 2022. Here we provide slides for some of the talks delivered during the minisymposium.
Abstract: The growth of domain-specific hardware devices, such as low- and mixed-precision Matrix Multiply-Accumulate (MMA) accelerators (for example, Tensor Processing Units and Tensor Cores), motivates several strands of research in scientific computing. First, algorithm designers aim to benefit from the speedup these hardware devices make possible by adapting algorithms, or parts of them, to run in low or mixed precisions. Second, we need to understand the low-level details of how the devices implement floating-point arithmetic and to what extent they satisfy floating-point arithmetic standards. Third, new rounding error analysis is being developed to further support the task of finding the best ways to use the accelerators in order to maximize the accuracy of the results. This minisymposium gathers researchers in scientific computing, numerical analysis, and the standardization and testing of floating-point arithmetics to report the latest research on applying and understanding the MMA hardware.
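To illustrate the mixed-precision idea behind MMA accelerators, which typically multiply in a low precision but accumulate in a higher one, here is a minimal sketch (not taken from any of the talks) that simulates binary32 in pure Python via the standard `struct` module; the helper name `to_f32` and the toy summation are our own illustrative choices, not hardware behaviour.

```python
import struct

def to_f32(x):
    """Round a Python float (binary64) to binary32 and back."""
    return struct.unpack('f', struct.pack('f', x))[0]

n = 10**6
term32 = to_f32(0.1)  # 0.1 rounded to binary32

# Pure low precision: the running sum is also kept in binary32,
# so each addition is rounded to 24 significand bits.
s_low = 0.0
for _ in range(n):
    s_low = to_f32(s_low + term32)

# Mixed precision: binary32 inputs accumulated in binary64,
# loosely analogous to an MMA unit with a wide accumulator.
s_mixed = 0.0
for _ in range(n):
    s_mixed += term32

exact = n * 0.1
print(abs(s_low - exact) / exact)    # large relative error (around 1e-2)
print(abs(s_mixed - exact) / exact)  # tiny, limited by the one rounding of 0.1
```

The gap between the two relative errors is exactly the kind of effect that the rounding error analyses discussed in this minisymposium quantify.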
Numerical Behavior of GPU Matrix Multiply-Accumulate Hardware. Massimiliano Fasi, Örebro University, Sweden; Nicholas J. Higham, Mantas Mikaitis, and Srikara Pranesh, The University of Manchester, United Kingdom; Florent Lopez, ANSYS, Inc., U.S.; Theo Mary, Sorbonne Universités and CNRS, France. Abstract.
Mixed Precision in Linear Algebra. Jack J. Dongarra, University of Tennessee and Oak Ridge National Laboratory, U.S.
Challenges of Mixed-Precision Fast Fourier Transforms from the Instruction Level to at Scale Computations. Lukasz Ligowski, NVIDIA, U.S. Abstract.
Double-Word Arithmetic and Accurate Calculation of Euclidean Norms. Vincent Lefèvre, Inria Paris, France; Nicolas Louvet, Université Claude Bernard Lyon 1, France; Jean-Michel Muller, CNRS, France; Joris Picot, École Normale Supérieure de Lyon, France; Laurence Rideau, Inria Sophia Antipolis, France. Abstract.
Online and Offline Precision Tuning in Hardware Accelerators. George A. Constantinides, Imperial College London, United Kingdom. Abstract.
Reducing Data Movement in Mixed Precision LU Factorization with GPU Tensor Cores. Atef Dorai, LIP-ENS Lyon, France; Roman Iakymchuk, Sorbonne Université, France and Fraunhofer ITWM, Germany; Florent Lopez, ANSYS, Inc., U.S.; Theo Mary, Sorbonne Universités and CNRS, France. Abstract.
BLIS: Mixing, Matching, and Extending Precision. Greg Henry, Intel Corporation, U.S.; Devin Matthews, Southern Methodist University, U.S.; Maggie E. Myers, Devangi N. Parikh, Robert A. van de Geijn, and Field G. Van Zee, University of Texas at Austin, U.S. Abstract.
Fluid Simulations Accelerated with 16 bit: Approaching 4x Speedup on A64FX by Squeezing ShallowWaters.jl into Float16. Milan Kloewer, University of Oxford, United Kingdom; Sam Hatfield, European Centre for Medium-Range Weather Forecasts (ECMWF), United Kingdom; Matteo Croci, University of Oxford, United Kingdom; Peter D. Dueben, ECMWF, United Kingdom; Tim Palmer, University of Oxford, United Kingdom. Abstract.