Skip to content
This repository has been archived by the owner on May 19, 2018. It is now read-only.

Stage 2: BigInt #588

Merged
merged 4 commits into from
Jun 28, 2017
Merged

Stage 2: BigInt #588

merged 4 commits into from
Jun 28, 2017

Conversation

wdhorton
Copy link
Contributor

Q A
Bug fix? no
Breaking change? no
New feature? yes
Deprecations? no
Spec compliancy? no
Tests added/pass? yes
Fixed tickets #569
License MIT

For the tokenizer, check if the last character is n, in which case it should be treated as a BigInt. If it has a dot or an E, throw, otherwise, advance the position one character. Replace the literal n with "" when parsing the value, and create a new token type bigint.

For the parser, when handling a bigint token, create a node of the new type BigIntLiteral with the given value.

@wdhorton
Copy link
Contributor Author

@littledan would love for you to take a look, let me know what other test cases I should add

@hzoo
Copy link
Member

hzoo commented Jun 18, 2017

Awesome @wdhorton!!

As for test cases can you add hex, octal, binary and then error for exponential form (you already have the error for decimals)?

@@ -0,0 +1 @@
1.0n
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can rename the invalid-0 to just invalid-decimal?

README.md Outdated
@@ -142,6 +142,7 @@ require("babylon").parse("code", {
- `numericSeparator` ([proposal](https://github.com/samuelgoto/proposal-numeric-separator))
- `optionalChaining` ([proposal](https://github.com/tc39/proposal-optional-chaining))
- `importMeta` ([proposal](https://github.com/tc39/proposal-import-meta))
- `bigint` ([proposal](https://github.com/tc39/proposal-bigint))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we think about making the plugin name bigInt with camelCase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually started with the plugin named bigInt. But then when it came time to make a new token type, I saw that the RegExp constructor mapped to regex so I figured that BigInt should map to bigint. So then I changed the plugin to match the token type.

But I'm ok with bigInt if you think that's better

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I just meant for the plugin string not the tt.bigint in the code if that makes sense (I wasn't clear earlier 😄)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bigInt seems right for the plugin if the convention is camelCase, while bigint is a pretty reasonable name for a tag IMO, as typeof 1n is "bigint".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's what I meant!

if (isIdentifierStart(this.fullCharCodeAtPos())) this.raise(this.state.pos, "Identifier directly after number");

const str = this.input.slice(start, this.state.pos).replace(/_/g, "");
const str = this.input.slice(start, this.state.pos).replace(/[_n]/g, "");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth putting a comment here to explain we are removing the _ for numeric separators and n for BigInt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sound like a good idea

@babel babel deleted a comment from codecov bot Jun 19, 2017
Copy link

@littledan littledan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For missing test cases: It'd also be good to have a BigInt that isn't exactly representable as a Number, and check that the output is as expected (and the test expectation shouldn't contain any rounded version).

"rawValue": 100,
"raw": "100n"
},
"value": 100

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the "value" (and possibly "rawValue", but I don't know much about what that field is for) fields should contain a string with the value of the BigInt as a decimal number (taking care of other bases). Using a Number as the value seems suboptimal as it'll be rounded; no one should use such a value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rawValue is basically the same as value and raw is just the actual string of the code

    this.addExtra(node, "rawValue", value);
    this.addExtra(node, "raw", this.input.slice(startPos, this.state.end));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@littledan I realized after I pushed this up that using a number as the value would defeat the purpose, since as you said we still couldn't represent 64-bit ints without rounding. I'll change it to strings

README.md Outdated
@@ -142,6 +142,7 @@ require("babylon").parse("code", {
- `numericSeparator` ([proposal](https://github.com/samuelgoto/proposal-numeric-separator))
- `optionalChaining` ([proposal](https://github.com/tc39/proposal-optional-chaining))
- `importMeta` ([proposal](https://github.com/tc39/proposal-import-meta))
- `bigint` ([proposal](https://github.com/tc39/proposal-bigint))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bigInt seems right for the plugin if the convention is camelCase, while bigint is a pretty reasonable name for a tag IMO, as typeof 1n is "bigint".

@@ -678,7 +687,7 @@ export default class Tokenizer extends LocationParser {
} else {
val = parseInt(str, 8);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these parseInt calls are appropriate if isBigInt, as they will round away part of the answer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're very right, that was an oversight on my part

this.state.pos += 2; // 0x
const val = this.readInt(radix);
if (val == null) this.raise(this.state.start + 2, "Expected number in radix " + radix);

if (this.hasPlugin("bigInt")) {
if (this.input.charCodeAt(this.state.pos) === 110) { // 'n'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using 0x6E instead of 110? It makes it more obvious that this refers to U+006E LATIN SMALL LETTER N.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing code was already checking the charCodes using non-hex integers, so I was just trying to stay consistent with that. I can go ahead and change it if you think that's best

@@ -644,6 +661,7 @@ export default class Tokenizer extends LocationParser {
const start = this.state.pos;
let octal = this.input.charCodeAt(start) === 48; // '0'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same could be done here (0x30 instead of 48)

@@ -661,11 +679,27 @@ export default class Tokenizer extends LocationParser {
if (next === 43 || next === 45) ++this.state.pos; // '+-'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (next === 0x2B || next === 0x2D)

}

if (this.hasPlugin("bigInt")) {
if (next === 110) { // 'n'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

"rawValue": "0b101011101",
"raw": "0b101011101n"
},
"value": "0b101011101"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Babylon handle converting this binary value into a decimal value, or should the transform be responsible for that/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that, and on a practical level I think it makes sense to delay that until the transform. If there was a binary value that exceeded the max value for a number, then in order to convert it we'd have to import a bigint library here. Whereas with the transform we're already going to need to use a library for the operations, so it makes sense to have it do the conversions as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be the same question as for Numeric Separators? We have a babel-plugin-transform-es2015-literals that does binary conversion but only for regular numbers.

https://github.com/babel/babel/pull/5793/files#diff-16974ce04e951262c4c093ce8d8a149cR94

I guess in this case since it's a BigInt, we'd make the new transform handle it. I'd say the transform.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that makes sense; I didn't realize this was typically handled in a separate transform.

Does it make sense to have the 'value' field at all, though? Why bother, when you have the rawValue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I guess most of the time value works for all Literals depending on what you want via Boolean, Null, String, etc. I guess for this it's the same thing and can't actually be represented as a number

I think it's fine to include it in the BigInt transform itself if es2015-literals are only going to handle binary, octal, etc for regular numbers.

@wdhorton
Copy link
Contributor Author

just added @mathiasbynens suggestions

@wdhorton
Copy link
Contributor Author

@hzoo @littledan is there anything keeping this from getting merged now?

@hzoo hzoo requested a review from danez June 28, 2017 03:29
@hzoo
Copy link
Member

hzoo commented Jun 28, 2017

Yeah we can iterate when we get to the transform part if it doesn't fit, tests wise it's fine to me!

@hzoo hzoo merged commit baa5f4d into babel:master Jun 28, 2017
@hzoo
Copy link
Member

hzoo commented Jun 28, 2017

Thanks @wdhorton for the PR, let us know if you want help with the transform as well

@wdhorton
Copy link
Contributor Author

@hzoo i have one starting question on the transform—is it ok to pull in an external lib to handle the math operations? idk what babel's policy is on adding 3rd-party dependencies, but it seems unavoidable in this case

@existentialism
Copy link
Member

@wdhorton shouldn't be an issue (guess it depends on the dep?)... we pull in regexpu for transform-es2015-unicode-regex for instance.

@hzoo
Copy link
Member

hzoo commented Jun 28, 2017

Yeah it needs a runtime library like described in #569. It's basically the syntax wrapper -> those calls to the library. It's similar to transform-runtime/regenerator in that way

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants