Pretokenise ints/floats #2563

Open
gfwilliams opened this issue Oct 3, 2024 · 0 comments

We used to pretokenise only tokens, and now we pretokenise Strings too. We could pretokenise ints/floats in a similar way, since parsing them has a nonzero overhead: one multiplication by 10 for each digit.
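To make the overhead concrete, here is a rough sketch in Python (not the interpreter's actual lexer) contrasting digit-by-digit parsing with reading a value that was already stored in binary. The function names `parse_decimal` and `read_pretokenised_int32`, and the little-endian 4-byte layout, are assumptions made up for illustration.

```python
import struct

def parse_decimal(text):
    """Parse an unsigned decimal literal one character at a time.

    Returns the value and how many multiplications by 10 were needed.
    """
    value = 0
    multiplies = 0
    for ch in text:
        # One multiply and one add for every digit in the literal.
        value = value * 10 + (ord(ch) - ord("0"))
        multiplies += 1
    return value, multiplies

def read_pretokenised_int32(buf, offset=0):
    """Read a value stored as 4 little-endian bytes: no per-digit work."""
    return struct.unpack_from("<i", buf, offset)[0]

value, multiplies = parse_decimal("123456")
stored = struct.pack("<i", 123456)  # what a pretokeniser could emit once
assert read_pretokenised_int32(stored) == value
```

The parse cost grows with the number of digits each time the literal is executed, while the stored form is a fixed-size read.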

Having a token for int8/int32 would probably be a good start. Tokenising floating point properly would require storing a double (8 bytes), otherwise the final value would be affected, so it would likely make the pretokenised code bigger. Another option is to store floating point in base 10 (maybe a 24 bit mantissa + 8 bit exponent), which would accurately store the vast majority of floating point values used in code.
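As a possible starting point, the encodings could look something like the Python sketch below. Everything here is hypothetical: the token ids (`TOKEN_INT8`, `TOKEN_INT32`, `TOKEN_DEC_FLOAT`), the byte order, and the function names are invented for the example, and the decoder leans on Python's correctly rounded `float()` parsing, which an on-device implementation would have to match separately.

```python
import struct
from decimal import Decimal

# Hypothetical token ids, chosen only for this sketch.
TOKEN_INT8 = 0xF0
TOKEN_INT32 = 0xF1
TOKEN_DEC_FLOAT = 0xF2

def encode_int(value):
    """Pick the smallest integer token that fits, or None to keep the text."""
    if -128 <= value <= 127:
        return bytes([TOKEN_INT8]) + struct.pack("<b", value)
    if -2**31 <= value < 2**31:
        return bytes([TOKEN_INT32]) + struct.pack("<i", value)
    return None

def encode_dec_float(literal):
    """Encode a finite decimal literal as mantissa * 10**exponent.

    Uses a signed 24 bit mantissa and a signed 8 bit exponent.
    Returns None when the literal does not fit, so the caller can
    fall back to another representation.
    """
    sign, digits, exponent = Decimal(literal).as_tuple()
    mantissa = int("".join(str(d) for d in digits))
    if sign:
        mantissa = -mantissa
    if not (-2**23 <= mantissa < 2**23 and -128 <= exponent <= 127):
        return None
    packed = mantissa.to_bytes(3, "little", signed=True)
    return bytes([TOKEN_DEC_FLOAT]) + packed + struct.pack("<b", exponent)

def decode_dec_float(buf):
    """Rebuild the float from a TOKEN_DEC_FLOAT encoding."""
    mantissa = int.from_bytes(buf[1:4], "little", signed=True)
    exponent = struct.unpack_from("<b", buf, 4)[0]
    # float() on a decimal string is correctly rounded in Python.
    return float(f"{mantissa}e{exponent}")
```

In this layout a literal such as `0.1` takes 5 bytes including the token (versus 9 for a token plus a raw double) and decodes to the same value as parsing the text, while a literal with too many significant digits, such as `3.141592653589793`, returns `None` and would need the 8 byte double or the original text.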
