Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for float literals starting with a period [DO NOT MERGE] #3280

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fingolfin
Copy link
Member

@fingolfin fingolfin commented Feb 11, 2019

Before this patch, writing .0 instead of 0.0 was allowed, as in C, C++, D, Java, Julia, Go, Perl, Python and many other languages. This required a hack in the form of an additional entry pointer for our scanner. We now get rid of it, matching languages like Ada, Haskell, OCaml, Rust, Swift.

The main motivation for this is to reduce the API surface of the GAP scanner, and to remove the need for code using it to emulate the reader (or at least substantial parts of it) in order to provide a fully compliant tokenizer for GAP code. This should simplify external implementations of code formatters and other similar code which wants to "parse" GAP code somehow.

Usability should not really be impacted: first off, most floats don't get entered by hand anyway, and secondly, usage of floats in GAP still is a relative rare thing, I expect only few people use it. Of course some package might use float constants starting with a dot somewhere, but so far I have not found any. In any case, code using such constants should be trivial to adapt, and once adapted, the code will of course still work in older GAP releases just fine, too.

Right now, I can only think of two serious arguments against this (which of course just means that my imagination is limited):

  • some code somebody has might rely on reading a large number of floats from a text file, and some of those floats start with a dot. But I think this would be easy to resolve, e.g. by preprocessing the input with a python script (or plain old sed, plus a regex).
  • breaking changes to language are a major no-go. I'd generally agree with this; but I think floats in GAP are very much a niche feature only, and, as explained above, I expect it to break very little existing code, if any.

I'd be very much interested to hear what others think about this, esp. @laurentbartholdi.

@fingolfin fingolfin added kind: request for comments do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) topic: kernel release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes labels Feb 11, 2019
Copy link
Contributor

@stevelinton stevelinton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sensible to me. My original idea was to implement all C floating point literals (except the ones that were already integers), but that failed anyway because we wanted 2E6 to be a an identifier rather than a number.

@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from 9ff60d3 to bb15065 Compare February 12, 2019 08:12
@laurentbartholdi
Copy link
Contributor

laurentbartholdi commented Feb 12, 2019 via email

@coveralls
Copy link

coveralls commented Feb 12, 2019

Coverage Status

Coverage decreased (-0.0002%) to 93.729% when pulling 317a1fa on fingolfin:mh/remove-ScanForFloatAfterDotHACK into e62d395 on gap-system:master.

@ChrisJefferson
Copy link
Contributor

Can you give an example of what makes the initial . hard to recognise, or is it just because they were added later in a slightly hacky way?

@stevelinton
Copy link
Contributor

The problem is how to tokenise something like

myrecord.17

without a clue from the parser level, this could be either a name followed by the float .17 or
a name followed by a dot followed by a record component name 17.

The existing hack supplies that hint.

@fingolfin
Copy link
Member Author

@laurentbartholdi I don't see a major problem with that output. Sure, it's not again valid input, but that's the case for many objects (and any dreams to the contrary are doomed). That said, as you say, it's easy to correct, too....

I am not quite sure what you have in mind when you say that you'd "like to check that it doesn't snowball." ?

@laurentbartholdi
Copy link
Contributor

laurentbartholdi commented Feb 12, 2019 via email

@ChrisJefferson
Copy link
Contributor

I'm not super-enthusiastic about this, while I'd be happy to have never accepted .0, I'm a bit worried about breaking it now.

The rules around how ., and floats in general, are parsed are already quite messy and have lots of special cases, so I think they'd be hard to implement for another program.

@fingolfin
Copy link
Member Author

fingolfin commented Feb 12, 2019

@laurentbartholdi that sounds like very fragile code to begin with. In any case, I of course run the float test suite, which still passes -- but it's rather small, and e.g. does not seem to cover printing at all, nor many other functions, so I guess it passing does not mean much. That of course by itself is kind of a problem, too.

@ChrisJefferson the rules are actually not that messy. I refactored the code last year, and it's IMHO pretty clear now, though of course not exactly trivial; but that holds for any code parsing floats, as one has to deal with exponent notation and whatnot.

In any case, as I wrote, part of the motivation is precisely so that code does not have to reimplement the logic, but rather can directly lift the tokenizer (i.e. the code in scanner.c) from GAP and use it. Of course we are not quite there yet with this PR -- but I have another which moves the scanner state into an argument passed to the scanner, so it does no longer access the GAP state at all.

@fingolfin
Copy link
Member Author

But it's good that you made me look again -- I actually can simplify the code some more in this PR, by removing seenADigit.

@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from bb15065 to d6f5450 Compare February 12, 2019 16:05
@codecov
Copy link

codecov bot commented Feb 12, 2019

Codecov Report

Merging #3280 (317a1fa) into master (3af3ebb) will increase coverage by 11.42%.
The diff coverage is 95.65%.

❗ Current head 317a1fa differs from pull request most recent head 6dc3d5d. Consider uploading reports for the commit 6dc3d5d to get more accurate results

@@             Coverage Diff             @@
##           master    #3280       +/-   ##
===========================================
+ Coverage   82.07%   93.49%   +11.42%     
===========================================
  Files         684      716       +32     
  Lines      289913   812223   +522310     
  Branches     8615        0     -8615     
===========================================
+ Hits       237933   759358   +521425     
- Misses      50082    52865     +2783     
+ Partials     1898        0     -1898     
Impacted Files Coverage Δ
src/read.c 97.32% <ø> (+1.14%) ⬆️
src/scanner.h 25.00% <ø> (ø)
src/scanner.c 99.48% <95.65%> (-0.26%) ⬇️
lib/memory.gi 43.12% <0.00%> (-39.92%) ⬇️
lib/methsel.g 51.16% <0.00%> (-24.70%) ⬇️
lib/colorprompt.g 46.15% <0.00%> (-24.16%) ⬇️
lib/matobjplist.gi 42.79% <0.00%> (-19.72%) ⬇️
lib/gasman.gi 26.66% <0.00%> (-18.27%) ⬇️
src/io.c 59.48% <0.00%> (-17.80%) ⬇️
lib/mgmfree.gi 58.53% <0.00%> (-16.47%) ⬇️
... and 366 more

@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from d6f5450 to 562201b Compare February 22, 2019 21:04
@fingolfin fingolfin added kind: discussion discussions, questions, requests for comments, and so on and removed kind: request for comments labels Mar 21, 2019
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from 562201b to b79076a Compare March 26, 2019 22:27
@fingolfin
Copy link
Member Author

I regret that I did not remember to bring this up during the last GAP days.

Anyway, I might whip up a PR which does not disable this, just prints a warning message whenever floats starting with a dot are used. That could be used to flush out any related issues in the float package. As I said, i already tried to find issues with this change inside the float package, but apparently just running the test suite is not enough for that (which is a concern of its own).

Frankly, the whole state of float support in GAP is a bit depressing to me. Most things are not documented (including whether we support parsing .123 or not, as far as I can tell); that makes it super hard to debug all kinds of things, as it's often unclear whether some behavior is intentional or a bug. Hrmpf.

@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from b79076a to aa51bf4 Compare May 6, 2019 13:51
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from aa51bf4 to d8592ec Compare May 14, 2019 21:04
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from d8592ec to f31eb57 Compare August 2, 2019 22:35
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from f31eb57 to f63840f Compare January 27, 2020 17:16
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from f63840f to 9f9c8b6 Compare May 6, 2020 21:08
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from 9f9c8b6 to dddce0b Compare May 22, 2020 15:46
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from dddce0b to 317a1fa Compare October 16, 2020 10:37
Before this patch, writing .0 instead of 0.0 was allowed, as in C, C++, D,
Java, Go, Perl, Python and many other languages. This required a hack in the
form of an additional entry pointer for our scanner. We now get rid of it,
matching languages like Ada, Haskell, OCaml, Rust, Swift.
@fingolfin fingolfin force-pushed the mh/remove-ScanForFloatAfterDotHACK branch from 317a1fa to 6dc3d5d Compare March 28, 2022 21:18
@fingolfin
Copy link
Member Author

@laurentbartholdi the motivation is discussed at length in this PR (which is marked as "DO NOT MERGE", don't worry at this time). If we ever again think making a bigger and partially breaking release of GAP, this is a candidate for inclusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) kind: discussion discussions, questions, requests for comments, and so on release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants