Tag Archives: floating-point

Plotting Circles to Test Low Precision Arithmetics and Rounding Modes

As part of the 28th IEEE Symposium of Computer Arithmetic (held in July 2021), a special issue of IEEE Transactions on Emerging Topics in Computing on “Emerging and Impacting Trends on Computer Arithmetic” was organized, which has recently been published [1]. With Massimiliano Fasi, we have contributed an article [2] (available Open Access) on simulating stochastic rounding [3] in floating-point arithmetics, and about which we wrote in this blog before [4].

One interesting numerical example which we used to test various arithmetics and rounding modes [Sec. VIII C2, 4] is an ordinary differential equation (ODE)

u'(t)=v(t), \\ v'(t)=-u(t).

With the initial conditions u(0)=1 and v(0)=0, the solution of this ODE lies on the unit circle in the uv-plane [p. 51, 5]. When solving it numerically our is aim to obtain the solution that lies as close as possible to the unit circle. A similar example is approximating orbits of asteroids in Physics [Sec. II A, 8], although it is more complicated since it additionally includes velocity and acceleration.

Using a forward Euler scheme we obtain:

u_{k+1}=u_k+hv_k, \\ v_{k+1}=v_k-hu_k,

where h=\frac{2\pi}{n} is a step size. Now, solving this with n=32 in bfloat16 arithmetic [6] with round to nearest (RN) and stochastic rounding in arithmetic operations, we obtain the solution in Figure 1 (dotted line—exact unit circle, dashed—RN, solid—SR), which spirals out as previously shown [p. 51, 5]. To improve this we could use a different solver, such as Trapezium or Leapfrog [p.51, 5], but for the purposes of testing arithmetics rather than solvers we stick with Euler and try different step sizes.

Figure 1: Unit circle ODE solutions with n=32. The x-axis represents u and the y-axis represents v. Dotted line—exact unit circle, dashed—RN, and solid—SR.

Next, we tried n=2^9 and the plot is shown on the left hand side of Figure 2. Visually, this timestep size for the bfloat16 arithmetic with RN looks ideal since it quite closely approximates the unit circle and the solution does not spiral out as before. SR performs quite well here as well but noticeably worse than RN.

Next, we further reduce the timestep to n=2^{11} and the solution with that is plotted on the right hand side of Figure 2. In this case the solution with RN has been affected by rounding errors—the approximation of the unit circle looks visually as an octagon rather than a circle!

So what happened there? In the paper, we explain this through the problem of stagnation in floating-point arithmetic. It happens when many addends to some relatively large value are small enough so that they are all lost to rounding and the sum stays at some initial or intermediately reached value. In this case, once we start at an initial point u=1 and v=0, we expect that both u and v will start to decrease. However, only v is doing that since in u_{k+1}=u_k+hv_k the term hv_k is too small initially to change u_k.

The same pattern repeats during the whole solution and u and v keep switching places in suffering from stagnation. Since SR is immune to stagnation [7], this issue does not appear.

Figure 2: Unit circle ODE solutions with n=2^{9} (left) and n=2^{11}.

Finally, we reduce the timestep by a further factor of 100 (Figure 3). This revealed that the ODE solution solved in bfloat16 hardly moves away from the initial conditions with RN, but is still quite accurately computing the approximation of the unit circle with SR.

Figure 3: Unit circle ODE solutions with n=2^{13}.

The unit circle ODE constitutes an easy to run experiment to observe stagnation in floating-point arithmetic, which is usually done through recursive summation (also in our paper, [Sec. VIII A-B, 2]), and is a good visually convenient benchmark for testing low precision arithmetics and alternative rounding modes, such as stochastic rounding, and perhaps for teaching students about floating point and stagnation.

Further detail on the unit circle experiments, as well as other experiments with different ODEs and solvers, can be found in Section VIII of our paper [2]. The MATLAB code for the experiments is available at https://github.com/mmikaitis/stochastic-rounding-evaluation.


[1] M. Joldes, F. Lamberti, and A. Nannarelli, Special Section on “Emerging and Impacting Trends on Computer Arithmetic”. IEEE Trans. Emerg. Topics Comput., 9:3. Sep. 2021

[2] M. Fasi and M. Mikaitis, Algorithms for Stochastically Rounded Elementary Arithmetic Operations in IEEE 754 Floating-Point Arithmetic. IEEE Trans. Emerg. Topics Comput., 9:3. Sep. 2021

[3] N. J. Higham, What is Stochastic Rounding?. Jul. 2020

[4] M. Mikaitis, Simulating Stochastically Rounded Floating-Point Arithmetic Efficiently. Nov. 2020

[5] N. J. Higham, “Goals of applied mathematical research” in The Princeton Companion to Applied Mathematics, N. J. Higham, M. R. Dennis, P. Glen- dinning, P. A. Martin, F. Santosa, and J. Tanner, Eds. Princeton, NJ, USA: Princeton Univ. Press, 2015, pp. 48–55.

[6] N. J. Higham, What is Bfloat16 Arithmetic?. Jun. 2020

[7] M. P. Connolly, N. J. Higham, and T. Mary, Stochastic rounding and its probabilistic backward error analysis. SIAM J. Sci. Comput., 43(1), A566–A585. Feb. 2021

[8] D. A. Fauxa and J. Godolphin, The floating point: Tales of the unexpected. Am. J. Phys., 89 (8). Aug. 2021