## Stochastic Rounding Has Unconditional Probabilistic Error Bounds

In the IEEE standard 754 for binary floating-point arithmetic, four rounding modes are defined.

- Round to nearest.
- Round towards $+\infty$.
- Round towards $-\infty$.
- Round towards zero.

Recently stochastic rounding, an old idea that dates back to the beginning of the digital computer era, has gained popularity, most notably in deep learning. While the rounding modes defined in the IEEE standard are deterministic, stochastic rounding is inherently random. We can define two modes of stochastic rounding. Consider the figure below, where we have a real number $x$ and adjacent floating-point numbers $a$ and $b$. In what we call mode 1 stochastic rounding, we round $x$ to either $a$ or $b$ with equal probability. In mode 2 stochastic rounding, we round $x$ to $a$ (or $b$) with probability proportional to one minus the distance of $x$ from $a$ (or $b$). So in the example shown, where $x$ lies closer to $b$, for mode 2 we are more likely to round to $b$ than to $a$.
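As a toy illustration (a sketch of my own, not code from the EPrint), both modes can be implemented for a hypothetical uniform grid of spacing `eps` standing in for the floating-point numbers:

```python
import random

def stochastic_round(x, eps=1.0, mode=2):
    """Stochastically round x to the grid of integer multiples of eps.

    mode 1: round to either neighbour with equal probability.
    mode 2: round to a neighbour with probability proportional to
            one minus the relative distance of x from it.
    """
    a = eps * (x // eps)   # grid point just below (or equal to) x
    b = a + eps            # grid point just above x
    if x == a:             # x already on the grid: no rounding needed
        return a
    if mode == 1:
        return random.choice([a, b])
    # mode 2: the closer x is to b, the likelier we round up to b
    p_up = (x - a) / eps
    return b if random.random() < p_up else a
```

Averaging many mode 2 roundings of, say, 0.3 gives a value close to 0.3, whereas mode 1 averages to the midpoint 0.5 regardless of where $x$ sits between the neighbours.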

In our recent EPrint *Stochastic Rounding and its Probabilistic Backward Error Analysis*, Nick Higham, Theo Mary and I generalized previous probabilistic error analysis results [SIAM J. Sci. Comput., 41 (2019), pp. A2815–A2835] to stochastic rounding.

We show that stochastic rounding (specifically mode 2) has the property of producing rounding errors $\delta_k$ that are random variables satisfying

- mean zero: $\mathbb{E}(\delta_k) = 0$,
- mean independence: $\mathbb{E}(\delta_k \mid \delta_1, \dots, \delta_{k-1}) = \mathbb{E}(\delta_k)$.

Here, $\mathbb{E}$ denotes the expectation. A key consequence of these results is that we can replace the worst case error bound proportional to $nu$, ubiquitous in backward error analyses, by a more informative probabilistic bound proportional to $\sqrt{n}u$, where $n$ is the problem size and $u$ is the unit roundoff. What is a rule of thumb for round to nearest becomes a *rule* for stochastic rounding: the rounding errors are proved to satisfy the above properties.
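To see why a single mode 2 rounding is unbiased (a standard calculation, consistent with the definition above), take $x \in [a, b]$: mode 2 rounds to $b$ with probability $(x - a)/(b - a)$ and to $a$ otherwise, so

$$\mathbb{E}\bigl(\operatorname{sr}(x)\bigr) = a\,\frac{b - x}{b - a} + b\,\frac{x - a}{b - a} = \frac{ab - ax + bx - ab}{b - a} = x,$$

and the rounding error $\operatorname{sr}(x) - x$ has mean zero.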

In the figure below, we compute the inner product $x^Ty$ of $n$-vectors sampled uniformly from $[0,1]$ in both round to nearest and stochastic rounding in IEEE half precision arithmetic. Shown is the backward error for each value of $n$ and the bounds $nu$ and $\sqrt{n}u$.
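The round to nearest half of this experiment can be approximated in a few lines, using NumPy's `float16` to emulate IEEE half precision (a sketch under that assumption, not an exact reproduction of the figure):

```python
import numpy as np

rng = np.random.default_rng(42)
u = 2.0 ** -11  # unit roundoff of IEEE half precision

def dot_rn16(x16, y16):
    """Inner product with every multiply and add rounded to nearest in fp16."""
    s = np.float16(0)
    for xi, yi in zip(x16, y16):
        p = np.float16(float(xi) * float(yi))  # fp16 multiply
        s = np.float16(float(s) + float(p))    # fp16 add
    return float(s)

n = 1000
x16 = np.float16(rng.random(n))
y16 = np.float16(rng.random(n))

# reference quantities computed exactly in double precision
xd, yd = x16.astype(np.float64), y16.astype(np.float64)
exact = xd @ yd
bwd = abs(dot_rn16(x16, y16) - exact) / (np.abs(xd) @ np.abs(yd))
print(bwd, (n ** 0.5) * u, n * u)  # backward error vs the two bounds
```

For random data of this size the measured backward error sits well below the worst case $nu$, which is the point the figure makes quantitatively across a range of $n$.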

For increasing problem size $n$, with round to nearest the error eventually no longer satisfies the $\sqrt{n}u$ bound. This is due to the phenomenon we call *stagnation*, to which low precisions are particularly susceptible. As we compute the recursive summation $s_i = s_{i-1} + x_i y_i$, eventually the partial sum $s_{i-1}$ can grow so large that the update $x_i y_i$ is less than half of the spacing of the floating-point numbers around $s_{i-1}$, and under round to nearest the partial sum does not increase. This means we produce negative rounding errors, which are of course not mean zero. Stochastic rounding avoids this issue by rounding randomly. In fact we can prove that $\mathbb{E}(\widehat{s}_n) = s_n$, that is, the expected value of the computed result under stochastic rounding is the exact result.
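Stagnation is easy to provoke: recursively summing uniform $[0,1]$ values in fp16 with round to nearest gets stuck once the partial sum reaches 2048, where the fp16 spacing is 2 and so every update falls below half the spacing, while stochastic rounding keeps tracking the true sum. The helper `sr16` below is a hypothetical mode 2 implementation of my own via fp16 bit patterns, valid for positive values only:

```python
import numpy as np

rng = np.random.default_rng(0)

def sr16(t):
    """Mode 2 stochastic rounding of a positive double t to fp16."""
    a = np.float16(t)                 # round-to-nearest candidate
    if float(a) == t:
        return a
    # neighbour on the other side of t: positive fp16 bit patterns are
    # monotone in value, so step the pattern by one unit
    step = 1 if float(a) < t else -1
    b = np.uint16(int(a.view(np.uint16)) + step).view(np.float16)
    lo, hi = (a, b) if float(a) < float(b) else (b, a)
    p_hi = (t - float(lo)) / (float(hi) - float(lo))
    return hi if rng.random() < p_hi else lo

n = 50_000
x = rng.random(n)

s_rn = np.float16(0)   # recursive sum under round to nearest
s_sr = np.float16(0)   # recursive sum under stochastic rounding
for xi in x:
    s_rn = np.float16(float(s_rn) + xi)
    s_sr = sr16(float(s_sr) + xi)

print(float(s_rn), float(s_sr), x.sum())  # s_rn stagnates near 2048
```

The round to nearest sum stalls at a tiny fraction of the true value of roughly $n/2 = 25000$, while the stochastic rounding sum lands close to it, as the mean zero property predicts.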