gh-128185: Align the grammar of the `Decimal` string constructor with `float`'s #128315

HarryLHW · 2024-12-28T15:30:51Z

This PR:

- removes the mention of underscores in the text
- changes `digits` from `digit [digit]...` to `(digit | "_")* digit (digit | "_")*`
- makes it a `.. productionlist::`

Issue: Align the grammar of the Decimal string constructor with float's #128185

📚 Documentation preview 📚: https://cpython-previews--128315.org.readthedocs.build/en/128315/library/decimal.html#decimal.Decimal

picnixz

I actually tested something and saw that Decimal("NaN__") is accepted as being a NaN. So, the grammar of NaN should also be changed. Initially, I thought that the grammar would be improved, but it might be easier to simply keep the mention in the text =/ (sorry @skirpichev, but you were probably right).

However, we can definitely use a production list instead.

Doc/library/decimal.rst

picnixz · 2024-12-28T16:27:34Z

Doc/library/decimal.rst

+      exponentpart: indicator [sign] digits
+      infinity: "Infinity" | "Inf"
+      nan: "NaN" [`digits`] | "sNaN" [`digits`]
+      numericvalue: `decimalpart` [`exponentpart`] | `infinity`


Do Sphinx production lists support `numeric-value`, or must rule names avoid special symbols? If `-` is not supported, use `_` instead to separate words.

Sphinx does not support -, but it does support _.

Yet another thing I should perhaps add to my list of things to fix upstream... Thanks for checking.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

HarryLHW · 2024-12-28T17:20:22Z

> I actually tested something and saw that Decimal("NaN__") is accepted as being a NaN. So, the grammar of NaN should also be changed. Initially, I thought that the grammar would be improved, but it might be easier to simply keep the mention in the text =/ (sorry @skirpichev, but you were probably right).
>
> However, we can definitely use a production list instead.

The implementation is

cpython/Modules/_decimal/_decimal.c

Lines 2135 to 2139 in 2cf396c

    
    for (; j < len; j++) {
        ch = PyUnicode_READ(kind, data, j);
        if (ignore_underscores && ch == '_') {
            continue;
        }

If I am correct, this ignores every '_' in the string no matter where it is, similar to string.replace('_', ''). Decimal('__I__N__F__') and Decimal('__N__A__N__') are also valid. It is difficult to generalize an expression for NaN or Infinity.

EDIT:
It is not difficult. The expression could be "_"* "N" "_"* "a" "_"* "N" "_"*, but it would be very confusing.
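The behavior described above can be checked from Python. This is a minimal sketch, assuming a CPython version whose `Decimal` string constructor accepts underscores (3.6 or later); the helper name `parse` is hypothetical and only wraps the constructor.

```python
from decimal import Decimal


def parse(text: str) -> Decimal:
    """Hypothetical helper: build a Decimal from a string via the constructor."""
    return Decimal(text)


# Underscores between digits are the documented use case.
print(parse("1_000") == Decimal(1000))

# Per the discussion above, underscores are also dropped around and inside
# the special values, not only between digits.
print(parse("NaN__").is_nan())
print(parse("__N__A__N__").is_nan())
print(parse("__I__N__F__").is_infinite())
```

Each line should print `True` if underscores are ignored wherever they appear, which is what makes a precise grammar for `nan` and `infinity` awkward to write.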

picnixz · 2024-12-28T17:24:18Z

> If I am correct, this ignores every '_' in the string no matter where it is, similar to string.replace('_', ''). Decimal('__I__N__F__') and Decimal('__N__A__N__') are also valid. It is difficult to generalize an expression for NaN or Infinity.

Considering this, I think we should keep the previous text. Sorry for this blunder but I didn't know that it also applied to NaN. However, we can use a production list.

HarryLHW · 2024-12-28T17:30:30Z

> > If I am correct, this ignores every '_' in the string no matter where it is, similar to string.replace('_', ''). Decimal('__I__N__F__') and Decimal('__N__A__N__') are also valid. It is difficult to generalize an expression for NaN or Infinity.
>
> Considering this, I think we should keep the previous text. Sorry for this blunder but I didn't know that it also applied to NaN. However, we can use a production list.

OK. Should I keep the underscores in digits?
digits ::= (digit | "_")* digit (digit | "_")*
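To see what the proposed `digits` rule accepts, it can be transcribed into a regular expression. This is only a sketch: the names `DIGITS_WITH_UNDERSCORES`, `is_digits`, and `is_digits_after_stripping` are hypothetical, and the pattern is restricted to ASCII digits for simplicity.

```python
import re

# Hypothetical regex transcription of: (digit | "_")* digit (digit | "_")*
# restricted to ASCII digits.
DIGITS_WITH_UNDERSCORES = re.compile(r"[0-9_]*[0-9][0-9_]*")


def is_digits(text: str) -> bool:
    """True if the whole string matches the proposed `digits` rule."""
    return DIGITS_WITH_UNDERSCORES.fullmatch(text) is not None


def is_digits_after_stripping(text: str) -> bool:
    """True if removing every underscore leaves one or more ASCII digits."""
    return re.fullmatch(r"[0-9]+", text.replace("_", "")) is not None


# The rule and the "strip underscores first" view agree on these samples.
for sample in ["1_000", "_1_", "1__0", "___", "", "1a"]:
    assert is_digits(sample) == is_digits_after_stripping(sample)
    print(repr(sample), is_digits(sample))
```

The rule requires at least one real digit, so a string made only of underscores is rejected, while leading, trailing, and repeated underscores are all allowed.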

picnixz · 2024-12-28T17:49:03Z

> OK. Should I keep the underscores in digits?

No, let's just revert the changes. Keeping it in the grammar would be more confusing (especially considering how the pattern would actually read). Just transform the current grammar into a production list. Before that, however, I'll ask some docs experts for their opinion: @ncoghlan @willingc. (Personally, I wouldn't mind the change, but we already say "we trim whitespace", so we are already pre-processing the input to simplify the grammar specification.)
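The "pre-process, then match a simple grammar" view can be sketched as follows. This is an illustration, not the actual `_decimal` implementation: `NUMERIC_STRING` and `matches_after_preprocessing` are hypothetical names, and the `decimal_part` rule is my reading of the published decimal docs (an assumption, since it does not appear in the diff hunks quoted in this thread). The pattern follows the grammar as written, ASCII digits only, and does not try to capture everything the constructor accepts.

```python
import re

# Hypothetical regex transcription of the documented grammar.
NUMERIC_STRING = re.compile(
    r"""
    [+-]?                                       # [sign]
    (?:
        (?:\d+\.\d*|\.?\d+)(?:[eE][+-]?\d+)?    # decimal_part [exponent_part]
      | Infinity                                # infinity
      | Inf
      | s?NaN\d*                                # nan
    )
    """,
    re.VERBOSE | re.ASCII,
)


def matches_after_preprocessing(text: str) -> bool:
    """Strip surrounding whitespace, drop every underscore, then match."""
    cleaned = text.strip().replace("_", "")
    return NUMERIC_STRING.fullmatch(cleaned) is not None


for sample in [" 1_000.5e-3 ", "NaN__", "__I__n__f__", "-.5", "___", "1..2"]:
    print(repr(sample), matches_after_preprocessing(sample))
```

With the pre-processing kept in the prose, the grammar itself stays short, which is the trade-off being argued for here.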

skirpichev

As it was already noted in the issue thread, I don't think there is a bug.

The new docs are inaccurate, as already noted by Bénédikt. While it's possible to "fix" this, I doubt that would be an improvement: the new grammar would be too complex, perhaps misleading, and would reveal implementation details that are better kept hidden.

picnixz · 2024-12-29T08:59:33Z

To summarize (unless our docs experts advise differently):

- let's not change the grammar or the text
- but let's change the grammar markup into a production list markup

I would recommend merging #128323 first and then this one or do both in one PR.

skirpichev

Looks good, except for a few grammar fixes.

Though maybe we want to reference terms from other grammars instead, as https://docs.python.org/3/reference/lexical_analysis.html#floating-point-literals does (almost everything is the same as in the grammar in the float() docs: https://docs.python.org/3/library/functions.html#float).

But let's see what @picnixz thinks about this first. Maybe for readability it's better to repeat everything in the decimal docs.

Doc/library/decimal.rst

skirpichev · 2024-12-31T01:58:18Z

Doc/library/decimal.rst

+      infinity: "Infinity" | "Inf"
+      nan: "NaN" [`digits`] | "sNaN" [`digits`]
+      numeric_value: `decimal_part` [`exponent_part`] | `infinity`
+      numeric_string: [`sign`] `numeric_value` | [`sign`] `nan`

   Other Unicode decimal digits are also permitted where ``digit``


Here you can reference the grammar term instead. This may conflict with #128323, though, so you can include that change here as well.

Doc/library/decimal.rst

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>

picnixz · 2024-12-31T18:25:55Z

> Maybe for readability it's better to repeat all in decimal docs.

I don't have a strong preference. For instance, I personally prefer digit [digit]... instead of (digit)+ but it's a matter of taste. Now, since we're far from the page about floats, maybe it's better to keep the verbose 0 | 1 | ... etc?

skirpichev · 2025-01-01T01:38:30Z

> since we're far from the page about floats, maybe it's better to keep the verbose

Yes, those were my thoughts as well. I think it's OK to repeat grammar terms, even if they were already introduced on other pages.

HarryLHW added 2 commits December 28, 2024 23:21

Align the grammar of the Decimal string constructor with float's

5cbf217

Use a production list

f5468ad

bedevere-app bot mentioned this pull request Dec 28, 2024

Align the grammar of the Decimal string constructor with float's #128185

Open

bedevere-app bot added docs Documentation in the Doc dir skip news awaiting review labels Dec 28, 2024

picnixz reviewed Dec 28, 2024

View reviewed changes

HarryLHW and others added 2 commits December 29, 2024 00:37

Update Doc/library/decimal.rst

b80f68f

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

Use _ to separate words

9646cd5

ZeroIntensity requested a review from skirpichev December 28, 2024 16:54

revert 'as well as underscores throughout'

f263f6d

skirpichev reviewed Dec 29, 2024

View reviewed changes

Revert the grammar change.

1e13f7c

skirpichev reviewed Dec 31, 2024

View reviewed changes

HarryLHW and others added 2 commits December 31, 2024 22:35

Update Doc/library/decimal.rst

2cd8e6b

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>

Update Doc/library/decimal.rst

015abbe

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>

skirpichev self-requested a review February 6, 2025 02:19

skirpichev removed their request for review February 26, 2025 02:46
