Fix Issue 20288 - std.format with separator and double causes RangeError #7222

burner · 2019-10-10T14:28:01Z

std.format with separator and double causes RangeError

https://issues.dlang.org/show_bug.cgi?id=20288

dlang-bot · 2019-10-10T14:28:04Z

Thanks for your pull request, @burner!

Bugzilla references

Auto-close	Bugzilla	Severity	Description
✓	20288	enhancement	std.format double with NaN fails with range violation on comma

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#7222"

std/format.d

CyberShadow · 2019-10-10T20:53:29Z

Please put the issue title in the commit/PR title, it's impossible to know what "Fix Issue 20288" means without further action.

wilzbach · 2019-10-14T13:17:01Z

Robert Schadek: I need a win32_64 box

This should help -> #7226 (Azure 10 parallel boxes and thus drastically faster start times).

ghost · 2019-10-20T12:34:25Z

Shouldn't "-nan" not exist at all? I'm aware, that there are several ways to express nans in the IEEE standard, but I always thought, that D does not distinguish between them and therefore should still output "nan" instead of "-nan" when the sign bit of a nan is set.

I'd like to point to issue 19878 which questions the result "-nan" when using writeln.

Is there anything official about using -nan or not? I was not able to find anything useful.

burner · 2019-10-20T16:36:24Z

@berni44 I looked at the standard. Section 6.3 of 754 2008 states that the standard does not interpret the sign bit of NaNs, but some operation specify it. IMHO phobos should therefore display the sign.

ghost · 2019-10-22T08:48:25Z

rikki cattermole wrote in the Forum in a related thread, that the behaviour of printf is the baseline. According to this -nan is correct and issue 19878 should be marked "invalid".

burner · 2019-10-22T09:22:48Z

printf on which platfrom, win32 win64, glibc, bsd? they all do things differently.

atilaneves · 2019-10-22T09:25:12Z

printf on which platfrom, win32 win64, glibc, bsd? they all do things differently.

Shouldn't that be covered by the C standard?

burner · 2019-10-22T09:43:08Z

I don't have the latest standard at hand, but in the latest publicly available one
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1570.pdf

on page 312 it states that

Adouble argument representing a NaN is converted in one of the styles
[-]nan or [-]nan(n-char-sequence) — which style, and the meaning of
any n-char-sequence, is implementation-defined.

The phrase implementation-defined appears quite a lot in the paragraph.
real values are not mentioned at all.

edit: windows does some really strange things. "1.#INF" is a thing

More interesting is footnote 277

When applied to infinite and NaN values, the -, +, and space flag characters have their
usual meaning

I therefore think that the implemented behavior of this PR is "correct" in that it makes all platforms act as linux 64bit.
Which I turn IMHO is the sanest of them all (sign is preserved)

ghost · 2019-10-22T10:39:24Z

Interesting. So printf and the C documentation is not really a reliable source...

Maybe we (= D community) have to define ourselves, what format() should do and clearly document that. Looks like lot of work...

burner · 2019-10-22T10:46:04Z

eventually yes, but not in this PR. This PR fixes a crash and makes a minor thing behave equal across all platforms.

Doing the thing you mentioned, would firstly require phobos to have its own float formatting. Currently we call snprintf.

But again, the this PR does a tiny narrow thing. And shouldn't grow to do all that.

ghost · 2019-10-22T10:57:40Z

Yes, I agree. In my oppinion, this PR can be merged.

Beyond this PR, for me the question is, what's better: Do more of such corrections or provide a complete rewrite of formattedWrite and Co. using regex for parsing the format string, providing our own float formatting and so on. When looking into bugzilla, there are lots of bugs in here and correcting them might be more work than a sound rewrite.

burner · 2019-10-22T11:02:07Z

I think doing our own formatImpl from float/double/real to string would be pretty good already.

edit:
the formatspec parsing is already good. IMHO regex would make it only worse.
IMO most bug reports come from an under-specified printf leading to different opinions.

ghost · 2019-10-22T11:05:47Z

Having had a closer look at the files, I don't understand how the changes in std.conv are related to this bug.

burner · 2019-10-22T11:06:59Z

windows 32 is doing some really crazy stuff with floats, these changes just add some sanity checks.

burner · 2019-10-22T11:08:22Z

and std.conv.to!string( some_float ) uses std.format

std/format.d

std.format with separator and double causes RangeError trying to find the win32 bug windows nan -nan fix harder workaround name clash fix moving some stuff around making it shorter oh win32_64 what are you doing to me and another try some windows special case and another and again removed some duplicated code make it compile again some debug output I need a win32_64 box and again better infos finding the failing test something strange is going on getting closer I got it maybe less output I think I understand now tighter code making the commit nicer removed an import undoing some debug changes removed an unneeded test

PetarKirov · 2019-10-23T06:49:34Z

std/conv.d

+            wstring ws = to!wstring(a);
+            assert(ws == "42"w, to!string(ws));
+            dstring ds = to!dstring(a);
+            assert(ds == "42"d, to!string(ds));


Is this change necessary?

I would say yes, while working on std.format this broke and I couldn't see the assert inputs. So this will help the next one person who gets gray hair looking at std.format!double.

Thanks for clarifying, I actually missed that you added the inputs as second argument to the asserts.

no worries, thanks for taking an interest.

Maybe checkaction=context should be enabled for DRuntime (and other repos) unittest builds

yes, please

dejlek · 2019-10-30T13:57:43Z

We have discussed yesterday performance issue with float --> string conversion. The default implementation in libC is painfully slow. PyPy is 2x faster (!!) because they have their own implementation. Something like this in Phobos would be welcome: https://www.reddit.com/r/ProgrammingLanguages/comments/930pch/ryu_a_new_algorithm_to_quickly_convert_floating/

burner · 2019-10-30T15:12:14Z

@dejlek not only that, but also the usage of libC is making format fundamentally not pure.
Which stops std.json to(Pretty)String being pure and a whole lot of other stuff as well.

ghost · 2019-10-30T16:12:01Z

@dejlek Thank for pointing to the ryu-algorithm. I'll go though it and have a look if I can use/implement it.

UplinkCoder · 2019-10-30T17:04:08Z

I have done a port of a grisu2 implementation which is quite fast and ctfeable.

Search for fpconv_ctfe.d

marler8997 · 2019-10-30T17:49:03Z

I ported the ryu algorithm to D in about an hour, source here: https://github.com/dragon-lang/mar/blob/master/src/mar/ryu.d

ghost · 2019-10-30T18:49:45Z

Meanwhile I read through the paper on ryu. I may be wrong, but I think this algorithm is not useful for this task, for two reasons:

a) The goal is slightly different to that of printf. While ryu tries to print as few digits as possible, printf prints as much as the format string defines. If that's larger than what ryu computes, the remaining digits will be wrong.
b) Ryu uses a lookuptable which is IMHO far too large for a standard library. For 128bit reals it's about 300K of data, that needs to be stored.

ghost · 2019-10-30T18:51:52Z

@UplinkCoder I'll have a look at the grisu implementations to see if it can be used for this task. But due to my time schedule this will probably have to wait one or two weeks.

moon-chilled · 2019-10-31T06:52:29Z

@marler8997 I'm working on a more robust (than the original) translation that has everything you need to replace the current implementation format() uses. Already support not printing out all the digits; main task now is not to print out numbers with small exponents as scientific notation (that is, 123.45 instead of 1.2345e+2). Then trailing/leading zeroes should be easy enough.

Benchmarks suggest a not quite 2x bump in speed; maybe 1.8x.

marler8997 · 2019-10-31T07:50:49Z

@moon-chilled great. You'll have to share it. I only ported the 32-bit version of ryu. For the 64-bit version I wasn't sure if it should be completely separate like the original did, or if I should try to combine them and re-use common code. I look forward to seeing your port.

moon-chilled · 2019-10-31T08:00:41Z

Here. (Lookup tables end at line 380--I really need to ctfe those, because they kill compile times.)

I was planning on mostly re-using code, once I figured out a good architecture for the 64-bit version.

Oh, also - I forgot; it'll probably also get a decent speedboost once I get the 128-bit multiplies in hardware.

burner · 2019-10-31T08:20:26Z

https://issues.dlang.org/show_bug.cgi?id=19124

MoonlightSentinel · 2020-03-03T15:50:39Z

This caused a regression: https://issues.dlang.org/show_bug.cgi?id=20623

burner requested a review from andralex as a code owner October 10, 2019 14:28

dlang-bot added the Severity:Enhancement label Oct 10, 2019

burner added the Severity:Bug Fix label Oct 10, 2019

John-Colvin reviewed Oct 10, 2019

View reviewed changes

std/format.d Show resolved Hide resolved

CyberShadow changed the title ~~Fix Issue 20288~~ Fix Issue 20288 - std.format with separator and double causes RangeError Oct 10, 2019

burner force-pushed the Issue20288_format_comma_NaN branch from 35e4b91 to 4bf038e Compare October 11, 2019 09:17

burner requested a review from JackStouffer as a code owner October 11, 2019 13:56

burner force-pushed the Issue20288_format_comma_NaN branch 3 times, most recently from e58956d to bba4b0d Compare October 16, 2019 15:47

ghost reviewed Oct 22, 2019

View reviewed changes

std/format.d Outdated Show resolved Hide resolved

burner force-pushed the Issue20288_format_comma_NaN branch from bba4b0d to 2b2ed40 Compare October 22, 2019 11:18

ghost approved these changes Oct 22, 2019

View reviewed changes

PetarKirov approved these changes Oct 23, 2019

View reviewed changes

PetarKirov added the Merge:auto-merge label Oct 23, 2019

dlang-bot merged commit 2f1f7a7 into dlang:master Oct 23, 2019

MoonlightSentinel mentioned this pull request Oct 24, 2019

Makefiles: Enable -checkaction-context for unittest builds dlang/druntime#2836

Merged

8 tasks

ghost mentioned this pull request Nov 5, 2019

Fix Issue 20357 - format should obey space flag when printing nan or inf #7263

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Issue 20288 - std.format with separator and double causes RangeError #7222

Fix Issue 20288 - std.format with separator and double causes RangeError #7222

burner commented Oct 10, 2019

dlang-bot commented Oct 10, 2019

CyberShadow commented Oct 10, 2019

wilzbach commented Oct 14, 2019

ghost commented Oct 20, 2019

burner commented Oct 20, 2019 •

edited

Loading

ghost commented Oct 22, 2019

burner commented Oct 22, 2019

atilaneves commented Oct 22, 2019

burner commented Oct 22, 2019 •

edited

Loading

ghost commented Oct 22, 2019

burner commented Oct 22, 2019

ghost commented Oct 22, 2019

burner commented Oct 22, 2019 •

edited

Loading

ghost commented Oct 22, 2019

burner commented Oct 22, 2019

burner commented Oct 22, 2019

PetarKirov Oct 23, 2019

burner Oct 23, 2019 •

edited

Loading

PetarKirov Oct 23, 2019

burner Oct 23, 2019

MoonlightSentinel Oct 23, 2019

burner Oct 23, 2019

dejlek commented Oct 30, 2019

burner commented Oct 30, 2019

ghost commented Oct 30, 2019

UplinkCoder commented Oct 30, 2019

marler8997 commented Oct 30, 2019

ghost commented Oct 30, 2019

ghost commented Oct 30, 2019

moon-chilled commented Oct 31, 2019 •

edited

Loading

marler8997 commented Oct 31, 2019

moon-chilled commented Oct 31, 2019

burner commented Oct 31, 2019

MoonlightSentinel commented Mar 3, 2020

Fix Issue 20288 - std.format with separator and double causes RangeError #7222

Fix Issue 20288 - std.format with separator and double causes RangeError #7222

Conversation

burner commented Oct 10, 2019

dlang-bot commented Oct 10, 2019

Bugzilla references

Testing this PR locally

CyberShadow commented Oct 10, 2019

wilzbach commented Oct 14, 2019

ghost commented Oct 20, 2019

burner commented Oct 20, 2019 • edited Loading

ghost commented Oct 22, 2019

burner commented Oct 22, 2019

atilaneves commented Oct 22, 2019

burner commented Oct 22, 2019 • edited Loading

ghost commented Oct 22, 2019

burner commented Oct 22, 2019

ghost commented Oct 22, 2019

burner commented Oct 22, 2019 • edited Loading

ghost commented Oct 22, 2019

burner commented Oct 22, 2019

burner commented Oct 22, 2019

PetarKirov Oct 23, 2019

Choose a reason for hiding this comment

burner Oct 23, 2019 • edited Loading

Choose a reason for hiding this comment

PetarKirov Oct 23, 2019

Choose a reason for hiding this comment

burner Oct 23, 2019

Choose a reason for hiding this comment

MoonlightSentinel Oct 23, 2019

Choose a reason for hiding this comment

burner Oct 23, 2019

Choose a reason for hiding this comment

dejlek commented Oct 30, 2019

burner commented Oct 30, 2019

ghost commented Oct 30, 2019

UplinkCoder commented Oct 30, 2019

marler8997 commented Oct 30, 2019

ghost commented Oct 30, 2019

ghost commented Oct 30, 2019

moon-chilled commented Oct 31, 2019 • edited Loading

marler8997 commented Oct 31, 2019

moon-chilled commented Oct 31, 2019

burner commented Oct 31, 2019

MoonlightSentinel commented Mar 3, 2020

burner commented Oct 20, 2019 •

edited

Loading

burner commented Oct 22, 2019 •

edited

Loading

burner commented Oct 22, 2019 •

edited

Loading

burner Oct 23, 2019 •

edited

Loading

moon-chilled commented Oct 31, 2019 •

edited

Loading