Back in 2007-ish I spent a bunch of time trying to figure out how to print floating point numbers losslessly. Annoyingly, there's no printf() format string that says "print exactly as many digits as needed so that it parses back to the same value". You can specify a number of digits to use, but to avoid printing unnecessary digits (e.g. "0.20000000000000001" for 0.2) you must ask for at most 15 digits, while in some cases you will lose data if you ask for fewer than 17.
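A minimal illustration of the trade-off, assuming IEEE-754 doubles (this is my own demo, not the original code):

```c
#include <stdio.h>

int main(void) {
    /* 15 significant digits: 0.2 prints cleanly, but 0.1 + 0.2 prints as
       "0.3", which parses back to a *different* double (information lost). */
    printf("%.15g  %.15g\n", 0.2, 0.1 + 0.2);

    /* 17 significant digits: every double round-trips, but 0.2 picks up
       noise digits: "0.20000000000000001". */
    printf("%.17g  %.17g\n", 0.2, 0.1 + 0.2);
    return 0;
}
```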
After asking around, I came across the Steele & White paper, and the implementation known as `dtoa()` written by David M. Gay. At the time, this seemed to be understood to be the "correct" answer.
But then I looked at the code: http://www.netlib.org/fp/dtoa.c
Uh.
There's a lot to dislike about that code, but arguably the worst thing is that it isn't reentrant: it mutates global variables and protects them with a global mutex. A global mutex lock, just to print a number! Whyyyyyyyy?
So then I tried something different: I wrote some code that would do sprintf() with 15 digits precision first, then parse it with strtod() to see if it came back exact. If not... then I did sprintf() with 17 digits precision... and called it a day.
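A minimal sketch of that approach (not the actual Protocol Buffers code, just the idea, assuming IEEE-754 doubles and the C standard library):

```c
#include <stdio.h>
#include <stdlib.h>

/* Format `value` with the fewest of {15, 17} significant digits that
   round-trips through strtod(). 17 digits always suffice for a double. */
void format_double(double value, char *buf, size_t size) {
    snprintf(buf, size, "%.15g", value);
    if (strtod(buf, NULL) != value) {
        /* 15 digits weren't enough to round-trip; fall back to 17. */
        snprintf(buf, size, "%.17g", value);
    }
}
```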
In benchmarks, this turned out to be just as fast as calling dtoa().
And so that's how the Protocol Buffers library deals with numbers when writing TextFormat:
https://github.com/google/protobuf/blob/ed4321d1cb3319998411...
But the bigger lesson for me was: Transmitting floating-point numbers in text is awful. People have no idea how ridiculously complex this is, because it seems like it ought to be simple. When you send JSON with numbers in it, you are probably invoking code that looks something like dtoa(), over and over and over again. And that's just to write them out; I have no idea how complex the parsing side is.
Please, folks, think of the CPU cycles. When sending numeric data, use a binary format.
People often forget about the numeric locale with floating point as text. Long ago, we had European collaborations, e.g. with an Anglophone site at one end of the network and a Francophone or Danish one at the other, sometimes "with hilarious consequences". I know %g etc. in C's printf isn't subject to that, but it was a real issue at the time.
Binary representations of floating point numbers are a bit of a nest of vipers too, though. You can probably just pipe IEEE floats across the network as bytes, in practice, but it's risky.
It's probably safer in many cases to just transmit things in fixed point.
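As an illustration of what that might look like, here is a sketch using a hypothetical micro-unit scale (not a recommendation for any particular protocol; it assumes the values fit in range and that six decimal places are enough):

```c
#include <stdint.h>
#include <math.h>

/* Sketch: transmit a quantity as an integer count of millionths instead
   of a double. The receiver divides by the agreed scale factor. */
int64_t to_micros(double value) {
    return (int64_t)llround(value * 1e6);   /* round to nearest micro-unit */
}

double from_micros(int64_t micros) {
    return micros / 1e6;
}
```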
The parent didn't say "don't have any float". I have had to translate between little endian IEEE754 and big endian IBM format, roughly copying what HDF did. Protein crystallographers invented their own binary file format rather than just using HDF, and didn't define the binary floating point format, so that the files weren't portable initially.
I was mainly thinking of byte order issues and handling of NaN / infinities (especially if, say, you're sending a value from one computer with FPU exceptions disabled to another one with exceptions enabled), but I seem to recall reading about other subtle implementation differences which tended to mean that floats aren't always portable between different CPUs even if they're ostensibly all IEEE compliant.
Almost everything uses IEEE-754. For those that don't, converting IEEE-754 to the local native format is almost certainly much easier than parsing text. Dealing with NaN-related issues is also only a couple instructions; much easier than parsing text. (Ideally, use a serialization library that already does these things.)
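For instance, a sketch of sending a double as raw IEEE-754 bits with an explicit byte order, which sidesteps host endianness and carries NaN/infinity bit patterns through unchanged (illustrative only; a real serialization library already does this for you):

```c
#include <stdint.h>
#include <string.h>

/* Serialize a double as 8 little-endian bytes, independent of host
   byte order. The bit pattern (including NaN/infinity) is preserved. */
void write_double_le(double value, unsigned char out[8]) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);      /* reinterpret, no conversion */
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(bits >> (8 * i));
    }
}

double read_double_le(const unsigned char in[8]) {
    uint64_t bits = 0;
    for (int i = 0; i < 8; i++) {
        bits |= (uint64_t)in[i] << (8 * i);
    }
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```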
It's too long ago to remember details, though the code may still be in use. I remember the binary conversion being hairy and ill-defined (754 not mapping cleanly onto the IBM/VAX formats), maybe even dependent on the FPU settings. Not that textual conversion would be well-defined either, of course.
754 has won, as has little endian. Even PowerPC has switched. There are IBM mainframes still being made but even they are now bi-endian because they know they've lost the war.
Unfortunately, that format will print 0.2 as "0.20000000000000001". What we want is the shortest-length string that will parse back to the same original value, which is "0.2".
We were able to update the paper on the ACM website: https://dl.acm.org/citation.cfm?id=2837654, although looking at it now, they did not update the title. I'll see if they can fix that.
There is very exciting news coming up in floating-point printing. Ulf Adams from Google will be presenting a new algorithm called Ryu that appears to be super fast, simple, and perfectly accurate. Assuming the claims are correct, Ryu ought to displace all of the current algorithms.
Thanks! My comment was not meant as a criticism of your work (I still like the paper, and do appreciate that you went to the effort of putting the amended results online), but more of the CS conference publishing model.
I've been looking at key-value stores recently, and I wonder what people think about collating numbers of heterogeneous types.
FoundationDB tuples have type codes that segregate values of different types, so that the strings "1" and "2" sort before the integers 1 and 2, which sort before the single-precision floats 1.0f and 2.0f, which sort before the double-precision floats 1.0 and 2.0.
The database that I test supports double-precision IEEE floats, and a proprietary decimal float with a signed 64-bit significand and signed 8-bit exponent. When converted to string for use as keys, these collate as expected. The price of this is that you don't get shortest representations of the sort sought by this paper and others. Otherwise, a binary and decimal float that compare unequal could convert to the same string.
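For the IEEE double half of that, the usual trick for making byte-wise (memcmp) order match numeric order is a bit-flip transform along these lines (an illustrative sketch, not any particular database's actual key format; NaN handling is left out):

```c
#include <stdint.h>
#include <string.h>

/* Encode an IEEE-754 double into 8 bytes whose unsigned lexicographic
   order matches numeric order: negatives flip all bits, non-negatives
   flip only the sign bit, then store big-endian. */
void double_to_sortable_key(double value, unsigned char key[8]) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    if (bits & 0x8000000000000000ULL) {
        bits = ~bits;                        /* negative: reverse the order */
    } else {
        bits |= 0x8000000000000000ULL;       /* non-negative: sort above negatives */
    }
    for (int i = 0; i < 8; i++) {            /* big-endian so memcmp works */
        key[i] = (unsigned char)(bits >> (56 - 8 * i));
    }
}
```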
I guess it's kind of unusual to use floats as keys, and more unusual still to use both binary and decimal floats, but I wonder if there is another strategy for collating them.
That's the paper referenced in this one, which more people may have seen, also about correctly converting floats to strings and vice-versa: https://www.ampl.com/REFS/rounding.pdf