Allow underscores in numbers. Closes #1677#3120
Allow underscores in numbers. Closes #1677#3120Balajiganapathi wants to merge 0 commit intoargotorg:developfrom
Conversation
libsolidity/parsing/Scanner.cpp
Outdated
There was a problem hiding this comment.
This doesn't prevent the leading underscore which seems to be weird in hex (0x_1234)
There was a problem hiding this comment.
I added that as python allows it:
(an example from https://www.python.org/dev/peps/pep-0515/)
flags = 0b_0011_1111_0100_1110
|
Thanks! Can you please add tests that (assuming we strictly follow PEP-0515):
Deviating from the PEP I'd argue that leading underscores in hex are a nuisance. Also please update the documentation (search for "rational and integer literals"). |
|
Also there's this rule in the PEP:
e.g. |
|
I agree with you that 0x_abc looks weird and not consistent with disallowing leading underscore for decimal numbers.
My understanding is that this is for the output format specifier, when converting from number to string. This is not a constraint on the syntax of a numeric literal. The grammar given does not specify this constraint. I am working on adding the tests and disallowing leading underscore. Is this additional grouping constraint really needed? |
|
@axic I have added more tests and docs. I also tried in python3.6 and it does allow arbitrary grouping of digits. e.g. |
|
@axic do you still want precisely sized groupings? |
|
Even the PEP introductory section suggests that. I think it definitely is a better idea to enforce that (we can always lax the rules, but harder to make them strict once out in a lax version). |
|
@axic Ok will do this then. Just to be clear the following will be invalid: 0x_abcd_cdef (leading underscore) What about the following? |
|
Good questions!
I guess this is fine too (I would personally not allow it, but I guess could be used to show where the fixed point would be after conversion? Tiny use case though). Perhaps this is the point where lax rules gain ground.
I think this is fine. |
|
@axic Thanks One more question :) Since we are allowing 0x123_4567, should we allow 0x1234_567 too? I think allowing only one of these 2 makes sense (and will make implementation slightly easier). |
|
I wouldn't allow |
|
Hi, I am thinking about the best way to implement this. I think this rule will cover all the cases. Or do you want more restrictive ones?
This will take care of all the above cases. But it will also allow such cases: |
|
Due to our ambiguous handling of literals, we should allow both Ok, perhaps to complicate things even further: If the literal ends in 4 hex digits without separator, the length of the first element is irrelevant. If it starts with four hex digits without separator, the last element has to have an even length. So |
|
I'd argue that in a number context one could say leaving the leading 0 nibble out is acceptable (as it is a number), while in a
|
|
Exactly! Although the "context" should not influence whether the literal is valid or not, it should be visible from looking at the literal itself. |
|
I can't follow you, you say Looking at |
|
Ok, sorry, let me clarify: |
|
Also I'd say the parser should not enforce any rules (perhaps apart from stray leading, trailing underscores), but rather have it in the SyntaxChecker. Reason:
|
|
@chriseth @axic , is the following rule fine?
How about in RationalNumberType::isValidLiteral in Types.cpp file?
I could not find this file in this repo :( |
|
Also I am assuming this does not apply to base 10 numerals as the rules of separating are different in different countries. |
|
@axic great idea about syntaxchecker! This would also automatically allow the parsing to continue in case of an error. @Balajiganapathi I would prefer the following rule: If there is no separator at all, it is fine. If the literal ends in 4 hex digits without separator, the length of the first element can be arbitrary. If it starts with four hex digits without separator, the last element has to have an even length. If neither the first nor the last has 4 hex digits without separator, the literal is invalid. |
|
Not able to find out why ci build is failing. It passed in first commit. In next 2 commits I only made some cosmetic changes, yet the ci is showing build failed. I looked at build logs, it looks like it is randomly failing some. |
This failure is down to Travis, you cannot do much against it. |
There was a problem hiding this comment.
Please add an assert that the first two characters are 0x.
There was a problem hiding this comment.
Could you make the error message a little more specific? Something like
Invalid use of underscores in hex literal. Found inner part of length 2 (has to be 4 characters).
There was a problem hiding this comment.
Please also make this error message more specific.
libsolidity/parsing/Scanner.cpp
Outdated
There was a problem hiding this comment.
I think control flow would be clearer if you move the rollback(1) before the if statement (and of course save m_char in a local variable).
There was a problem hiding this comment.
Ok, reading that again - why do you advance if you rollback later anyway?
There was a problem hiding this comment.
Uses length and digits. I think it should just use one term and digits seems to be fine. This applies to the other messages too.
|
Please rebase. |
5f03a1d to
99be022
Compare
|
@pirapira Rebased. |
|
@axic please merge if you are fine with it. |
Changelog.md
Outdated
There was a problem hiding this comment.
Please move this entry to 0.4.20.
99be022 to
58832d9
Compare
|
Rebased. |
|
Moved to 0.5.0 |
|
@ekpyron @bit-shift does one of you want to continue this? There are still some unaddressed comments by @axic. |
|
@chriseth I will fix those this weekend. |
|
Rebased. |
df71ea9 to
9ec3fd1
Compare
|
not sure how I closed this PR? |
Closes #1677.
Follows https://www.python.org/dev/peps/pep-0515/