Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numeric separators #20324

Merged
merged 11 commits into from
Dec 9, 2017
4 changes: 4 additions & 0 deletions src/compiler/diagnosticMessages.json
Original file line number Diff line number Diff line change
Expand Up @@ -3420,6 +3420,10 @@
"category": "Message",
"code": 6187
},
"Numeric separators are not allowed here.": {
"category": "Error",
"code": 6188
},
"Variable '{0}' implicitly has an '{1}' type.": {
"category": "Error",
"code": 7005
Expand Down
127 changes: 107 additions & 20 deletions src/compiler/scanner.ts
Original file line number Diff line number Diff line change
Expand Up @@ -561,9 +561,9 @@ namespace ts {
return false;
}

function scanConflictMarkerTrivia(text: string, pos: number, error?: ErrorCallback) {
function scanConflictMarkerTrivia(text: string, pos: number, error?: (diag: DiagnosticMessage, pos?: number, len?: number) => void) {
if (error) {
error(Diagnostics.Merge_conflict_marker_encountered, mergeConflictMarkerLength);
error(Diagnostics.Merge_conflict_marker_encountered, pos, mergeConflictMarkerLength);
}

const ch = text.charCodeAt(pos);
Expand Down Expand Up @@ -852,34 +852,86 @@ namespace ts {
scanRange,
};

function error(message: DiagnosticMessage, length?: number): void {
function error(message: DiagnosticMessage): void;
function error(message: DiagnosticMessage, errPos: number, length: number): void;
function error(message: DiagnosticMessage, errPos: number = pos, length?: number): void {
if (onError) {
const oldPos = pos;
pos = errPos;
onError(message, length || 0);
pos = oldPos;
}
}

function scanNumberFragment(): string {
let start = pos;
let allowSeparator = false;
let result = "";
while (true) {
const ch = text.charCodeAt(pos);
if (ch === CharacterCodes._) {
tokenFlags |= TokenFlags.ContainsSeparator;
if (allowSeparator) {
allowSeparator = false;
result += text.substring(start, pos);
}
else {
error(Diagnostics.Numeric_separators_are_not_allowed_here, pos, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Multiple consecutive numeric separators are not permitted."

}
pos++;
start = pos;
continue;
}
if (isDigit(ch)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the similarities between this, scanHexDigit, and scanBinaryOrOctalDigits perhaps we should unify (at least in part) these three functions by having scanNumberFragment take a callback for isDigit and then calling it as scanNumberFragment(isDigit), scanNumberFragment(isHexDigit), scanNumberFragment(isBinaryDigit), and scanNumberFragment(isOctalDigit).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scanBinaryOrOctalDigits and scanHexDigit also do arithmetic to calculate the actual numeric value, however (rather than relying on the host being able to parse the literal correctly). I would also need to pass thru another callback to retain that behavior, too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scanHexDigit also has to deal with validating that there are a certain number of digits and have an option to not recognize separators, for when its used while handling escape sequences. IMO, while the general structure is similar, they're distinct enough to not warrant a shared base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you can pass in a base of type 2 | 8 | 10 | 16 like in scanBinaryOrOctalDigits and special case hex characters for base 16?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth keeping focus on performance when parsing numbers, even at a cost of flexibility. Too hot a code path.

allowSeparator = true;
pos++;
continue;
}
break;
}
if (text.charCodeAt(pos - 1) === CharacterCodes._) {
error(Diagnostics.Numeric_separators_are_not_allowed_here, pos - 1, 1);
}
return result + text.substring(start, pos);
}

function scanNumber(): string {
const start = pos;
while (isDigit(text.charCodeAt(pos))) pos++;
const mainFragment = scanNumberFragment();
let decimalFragment: string;
let scientificFragment: string;
if (text.charCodeAt(pos) === CharacterCodes.dot) {
pos++;
while (isDigit(text.charCodeAt(pos))) pos++;
decimalFragment = scanNumberFragment();
}
let end = pos;
if (text.charCodeAt(pos) === CharacterCodes.E || text.charCodeAt(pos) === CharacterCodes.e) {
pos++;
tokenFlags |= TokenFlags.Scientific;
if (text.charCodeAt(pos) === CharacterCodes.plus || text.charCodeAt(pos) === CharacterCodes.minus) pos++;
if (isDigit(text.charCodeAt(pos))) {
pos++;
while (isDigit(text.charCodeAt(pos))) pos++;
end = pos;
const preNumericPart = pos;
const finalFragment = scanNumberFragment();
if (!finalFragment) {
error(Diagnostics.Digit_expected);
}
else {
error(Diagnostics.Digit_expected);
scientificFragment = text.substring(end, preNumericPart) + finalFragment;
end = pos;
}
}
if (tokenFlags & TokenFlags.ContainsSeparator) {
let result = mainFragment;
if (decimalFragment) {
result += "." + decimalFragment;
}
if (scientificFragment) {
result += scientificFragment;
}
return "" + +result;
}
else {
return "" + +(text.substring(start, end)); // No need to use all the fragments; no _ removal needed
}
return "" + +(text.substring(start, end));
}

function scanOctalDigits(): number {
Expand All @@ -894,23 +946,36 @@ namespace ts {
* Scans the given number of hexadecimal digits in the text,
* returning -1 if the given number is unavailable.
*/
function scanExactNumberOfHexDigits(count: number): number {
return scanHexDigits(/*minCount*/ count, /*scanAsManyAsPossible*/ false);
function scanExactNumberOfHexDigits(count: number, canHaveSeparators: boolean): number {
return scanHexDigits(/*minCount*/ count, /*scanAsManyAsPossible*/ false, canHaveSeparators);
}

/**
* Scans as many hexadecimal digits as are available in the text,
* returning -1 if the given number of digits was unavailable.
*/
function scanMinimumNumberOfHexDigits(count: number): number {
return scanHexDigits(/*minCount*/ count, /*scanAsManyAsPossible*/ true);
function scanMinimumNumberOfHexDigits(count: number, canHaveSeparators: boolean): number {
return scanHexDigits(/*minCount*/ count, /*scanAsManyAsPossible*/ true, canHaveSeparators);
}

function scanHexDigits(minCount: number, scanAsManyAsPossible: boolean): number {
function scanHexDigits(minCount: number, scanAsManyAsPossible: boolean, canHaveSeparators: boolean): number {
let digits = 0;
let value = 0;
let allowSeparator = false;
while (digits < minCount || scanAsManyAsPossible) {
const ch = text.charCodeAt(pos);
if (canHaveSeparators && ch === CharacterCodes._) {
tokenFlags |= TokenFlags.ContainsSeparator;
if (allowSeparator) {
allowSeparator = false;
}
else {
error(Diagnostics.Numeric_separators_are_not_allowed_here, pos, 1);
}
pos++;
continue;
}
allowSeparator = canHaveSeparators;
if (ch >= CharacterCodes._0 && ch <= CharacterCodes._9) {
value = value * 16 + ch - CharacterCodes._0;
}
Expand All @@ -929,6 +994,9 @@ namespace ts {
if (digits < minCount) {
value = -1;
}
if (text.charCodeAt(pos - 1) === CharacterCodes._) {
error(Diagnostics.Numeric_separators_are_not_allowed_here, pos - 1, 1);
}
return value;
}

Expand Down Expand Up @@ -1097,7 +1165,7 @@ namespace ts {
}

function scanHexadecimalEscape(numDigits: number): string {
const escapedValue = scanExactNumberOfHexDigits(numDigits);
const escapedValue = scanExactNumberOfHexDigits(numDigits, /*canHaveSeparators*/ false);

if (escapedValue >= 0) {
return String.fromCharCode(escapedValue);
Expand All @@ -1109,7 +1177,7 @@ namespace ts {
}

function scanExtendedUnicodeEscape(): string {
const escapedValue = scanMinimumNumberOfHexDigits(1);
const escapedValue = scanMinimumNumberOfHexDigits(1, /*canHaveSeparators*/ false);
let isInvalidExtendedEscape = false;

// Validate the value of the digit
Expand Down Expand Up @@ -1162,7 +1230,7 @@ namespace ts {
if (pos + 5 < end && text.charCodeAt(pos + 1) === CharacterCodes.u) {
const start = pos;
pos += 2;
const value = scanExactNumberOfHexDigits(4);
const value = scanExactNumberOfHexDigits(4, /*canHaveSeparators*/ false);
pos = start;
return value;
}
Expand Down Expand Up @@ -1218,8 +1286,22 @@ namespace ts {
// For counting number of digits; Valid binaryIntegerLiteral must have at least one binary digit following B or b.
// Similarly valid octalIntegerLiteral must have at least one octal digit following o or O.
let numberOfDigits = 0;
let separatorAllowed = false;
while (true) {
const ch = text.charCodeAt(pos);
// Numeric seperators are allowed anywhere within a numeric literal, except not at the beginning, or following another separator
if (ch === CharacterCodes._) {
tokenFlags |= TokenFlags.ContainsSeparator;
if (separatorAllowed) {
separatorAllowed = false;
}
else {
error(Diagnostics.Numeric_separators_are_not_allowed_here, pos, 1);
}
pos++;
continue;
}
separatorAllowed = true;
const valueOfCh = ch - CharacterCodes._0;
if (!isDigit(ch) || valueOfCh >= base) {
break;
Expand All @@ -1232,6 +1314,11 @@ namespace ts {
if (numberOfDigits === 0) {
return -1;
}
if (text.charCodeAt(pos - 1) === CharacterCodes._) {
// Literal ends with underscore - not allowed
error(Diagnostics.Numeric_separators_are_not_allowed_here, pos - 1, 1);
return value;
}
return value;
}

Expand Down Expand Up @@ -1435,7 +1522,7 @@ namespace ts {
case CharacterCodes._0:
if (pos + 2 < end && (text.charCodeAt(pos + 1) === CharacterCodes.X || text.charCodeAt(pos + 1) === CharacterCodes.x)) {
pos += 2;
let value = scanMinimumNumberOfHexDigits(1);
let value = scanMinimumNumberOfHexDigits(1, /*canHaveSeparators*/ true);
if (value < 0) {
error(Diagnostics.Hexadecimal_digit_expected);
value = 0;
Expand Down
3 changes: 2 additions & 1 deletion src/compiler/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -1490,8 +1490,9 @@ namespace ts {
HexSpecifier = 1 << 6, // e.g. `0x00000000`
BinarySpecifier = 1 << 7, // e.g. `0b0110010000000000`
OctalSpecifier = 1 << 8, // e.g. `0o777`
ContainsSeparator = 1 << 9, // e.g. `0b1100_0101`
BinaryOrOctalSpecifier = BinarySpecifier | OctalSpecifier,
NumericLiteralFlags = Scientific | Octal | HexSpecifier | BinarySpecifier | OctalSpecifier
NumericLiteralFlags = Scientific | Octal | HexSpecifier | BinarySpecifier | OctalSpecifier | ContainsSeparator
}

export interface NumericLiteral extends LiteralExpression {
Expand Down
2 changes: 1 addition & 1 deletion src/compiler/utilities.ts
Original file line number Diff line number Diff line change
Expand Up @@ -346,7 +346,7 @@ namespace ts {
export function getLiteralText(node: LiteralLikeNode, sourceFile: SourceFile) {
// If we don't need to downlevel and we can reach the original source text using
// the node's parent reference, then simply get the text as it was originally written.
if (!nodeIsSynthesized(node) && node.parent) {
if (!nodeIsSynthesized(node) && node.parent && !(isNumericLiteral(node) && node.numericLiteralFlags & TokenFlags.ContainsSeparator)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should emit them as is for ESNext i would say..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acctually ES2018 and later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add an es2018 target now or wait until the ES2018 spec is official?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhegazy said offline that we should open an issue to add an ES2018 target, since there are some lib additions to make in addition to adding an output mode that preserves these separators.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #20342 to track.

return getSourceTextOfNodeFromSourceFile(sourceFile, node);
}

Expand Down
12 changes: 12 additions & 0 deletions tests/baselines/reference/parser.numericSeparators.binary.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
//// [parser.numericSeparators.binary.ts]
0b00_11;
0B0_1;
0b1100_0011;
0B0_11_0101;


//// [parser.numericSeparators.binary.js]
3;
1;
195;
53;
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
=== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/parser.numericSeparators.binary.ts ===
0b00_11;
No type information for this code.0B0_1;
No type information for this code.0b1100_0011;
No type information for this code.0B0_11_0101;
No type information for this code.
No type information for this code.
13 changes: 13 additions & 0 deletions tests/baselines/reference/parser.numericSeparators.binary.types
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
=== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/parser.numericSeparators.binary.ts ===
0b00_11;
>0b00_11 : 3

0B0_1;
>0B0_1 : 1

0b1100_0011;
>0b1100_0011 : 195

0B0_11_0101;
>0B0_11_0101 : 53

Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/1.ts(1,5): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/2.ts(1,3): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/3.ts(1,2): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/3.ts(1,3): error TS1005: ';' expected.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/3.ts(1,3): error TS2304: Cannot find name 'B0101'.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/4.ts(1,6): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/5.ts(1,13): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/6.ts(1,3): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/6.ts(1,4): error TS6188: Numeric separators are not allowed here.
tests/cases/conformance/parser/ecmascriptnext/numericSeparators/6.ts(1,5): error TS6188: Numeric separators are not allowed here.


==== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/1.ts (1 errors) ====
0b00_
~
!!! error TS6188: Numeric separators are not allowed here.

==== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/2.ts (1 errors) ====
0b_110
~
!!! error TS6188: Numeric separators are not allowed here.

==== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/3.ts (3 errors) ====
0_B0101
~
!!! error TS6188: Numeric separators are not allowed here.
~~~~~
!!! error TS1005: ';' expected.
~~~~~
!!! error TS2304: Cannot find name 'B0101'.

==== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/4.ts (1 errors) ====
0b01__11
~
!!! error TS6188: Numeric separators are not allowed here.

==== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/5.ts (1 errors) ====
0B0110_0110__
~
!!! error TS6188: Numeric separators are not allowed here.

==== tests/cases/conformance/parser/ecmascriptnext/numericSeparators/6.ts (3 errors) ====
0b___0111010_0101_1
~
!!! error TS6188: Numeric separators are not allowed here.
~
!!! error TS6188: Numeric separators are not allowed here.
~
!!! error TS6188: Numeric separators are not allowed here.

Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
//// [tests/cases/conformance/parser/ecmascriptnext/numericSeparators/parser.numericSeparators.binaryNegative.ts] ////

//// [1.ts]
0b00_

//// [2.ts]
0b_110

//// [3.ts]
0_B0101

//// [4.ts]
0b01__11

//// [5.ts]
0B0110_0110__

//// [6.ts]
0b___0111010_0101_1


//// [1.js]
0;
//// [2.js]
6;
//// [3.js]
0;
B0101;
//// [4.js]
7;
//// [5.js]
102;
//// [6.js]
1867;
Loading