Some tweaks for backtick strings #45099

elibarzilay · 2021-07-19T12:54:15Z

getRawLiteral(): barf if currentSourceFile is missing, since if
it is, then the following getSourceTextOfNodeFromSourceFile will
return a bogus "".
One || -> ?? change.
backtickQuoteEscapedCharsRegExp: escape the usual control
characters except for a simple LF. This code does get used to
generate backtick strings when rawText is not given, and not
escaping things like TAB characters can get mangled by editor
settings. Worse, not escaping a CRLF and putting it verbatim in sthe
string source will interpret it as LF, so add a special case for
escaping these as \r\n.
Added test.

src/compiler/transformers/taggedTemplate.ts

src/compiler/utilities.ts

rbuckton · 2021-07-28T19:32:47Z

src/compiler/utilities.ts

-    // Template strings should be preserved as much as possible
-    const backtickQuoteEscapedCharsRegExp = /[\\`]/g;
+    // Template strings preserve simple LF newlines, still encode CRLF (or CR)
+    const backtickQuoteEscapedCharsRegExp = /\r\n|[\\\`\u0000-\u001f\t\v\f\b\r\u2028\u2029\u0085]/g;


Can you add a test for this change? I'm unclear on why we wouldn't want to preserve these. backtickQuoteEsacpedCharsRegExp is only used when the quote character is CharacterCodes.backtick, which mean's we're preserving the raw text of the template string as much as possible, so it seems like adding extra escapes would break anyone using the raw strings.

I'll see if I can add a test for this.

Re the reason, I listed two reasons in the commit log:

Not escaping things like a TAB or a FF character can get mangled by editor style settings

Not escaping a CRLF newline and instead having the template literal have a CRLF will be interpreted as a regular newline, and result in a string with just LF. (It's even more likely to get mangled by your editor, but even if not you'll get that problem.)

Not escaping things like a TAB or a FF character can get mangled by editor style settings

Yeah, I can see that being a possible issue, however I'm still not sure "<TAB>foo" should become `\tfoo` since it is still semantically different between the raw and cooked forms. There are only two ways to get to this code path: 1) Via a user-defined transformation, in which case we would want to preserve what was written, or 2) via one of our code fixes. If we want to normalize TAB or FF in the code fix, that's fine, but I'm concerned whether this would affect anyone using custom transforms. This is a small enough corner case its probably reasonable however, and we can address it if there's feedback.

Not escaping a CRLF newline and instead having the template literal have a CRLF will be interpreted as a regular newline, and result in a string with just LF. (It's even more likely to get mangled by your editor, but even if not you'll get that problem.)

StringLiteral can't contain a raw CRLF unless its part of a LineContinuation, which becomes an empty code sequence, per NOTE 2 of https://tc39.es/ecma262/#sec-literals-string-literals and in https://tc39.es/ecma262/#sec-static-semantics-sv, so that wouldn't matter in a StringLiteral → TemplateLiteral conversion. The escape sequence \r\n in a string literal should remain as the escape sequence \r\n in the template literal, but we shouldn't just convert a raw CRLF into the escape sequence as that could impact custom transforms (which I could see taking a reliance on the CRLF->LF normalization of template literals).

So, -1 on converting CRLF into the escaped form, +1 on converting other characters (at least until someone raises an issue).

src/compiler/utilities.ts

elibarzilay · 2021-07-31T04:02:03Z

@rbuckton, I added a test. The only place I saw this code used is when printing types, so that's what the test is using. (And the use of CRLF is to stress that this was actually broken before this change.)

rbuckton

I'm not 100% clear on how the test added tests the changes to the RegExp, since that test relies on an existing source file, from which we would pull the source text and wouldn't use backtickQuoteEscapedCharsRegExp. You would need to add a unit test that exercises the NodeFactory API to create and print a node with custom text/rawText to prove out this change.

rbuckton · 2021-08-02T21:48:54Z

src/compiler/utilities.ts

-    // Template strings should be preserved as much as possible
-    const backtickQuoteEscapedCharsRegExp = /[\\`]/g;
+    // Template strings preserve simple LF newlines, still encode CRLF (or CR)
+    const backtickQuoteEscapedCharsRegExp = /\r\n|[\\\`\u0000-\u001f\t\v\f\b\r\u2028\u2029\u0085]/g;


Not escaping things like a TAB or a FF character can get mangled by editor style settings

Yeah, I can see that being a possible issue, however I'm still not sure "<TAB>foo" should become `\tfoo` since it is still semantically different between the raw and cooked forms. There are only two ways to get to this code path: 1) Via a user-defined transformation, in which case we would want to preserve what was written, or 2) via one of our code fixes. If we want to normalize TAB or FF in the code fix, that's fine, but I'm concerned whether this would affect anyone using custom transforms. This is a small enough corner case its probably reasonable however, and we can address it if there's feedback.

Not escaping a CRLF newline and instead having the template literal have a CRLF will be interpreted as a regular newline, and result in a string with just LF. (It's even more likely to get mangled by your editor, but even if not you'll get that problem.)

StringLiteral can't contain a raw CRLF unless its part of a LineContinuation, which becomes an empty code sequence, per NOTE 2 of https://tc39.es/ecma262/#sec-literals-string-literals and in https://tc39.es/ecma262/#sec-static-semantics-sv, so that wouldn't matter in a StringLiteral → TemplateLiteral conversion. The escape sequence \r\n in a string literal should remain as the escape sequence \r\n in the template literal, but we shouldn't just convert a raw CRLF into the escape sequence as that could impact custom transforms (which I could see taking a reliance on the CRLF->LF normalization of template literals).

So, -1 on converting CRLF into the escaped form, +1 on converting other characters (at least until someone raises an issue).

src/compiler/utilities.ts

elibarzilay · 2021-08-04T21:21:43Z

I think that there's some confusion here. This is no longer related to the string literal code fix, since I made it hand out a rawText form in the other PR. The only place that I saw that this code affects is the output of string literal types, and that's what the test uses. Also, in the context of the type, I thought that it was expected to not always have a source code for the type since these strings are shuffled in the checker--?

In any case, if you run the same test on main, you'll see two differences:

The .d.ts output
```
declare const f: (hdr: string, val: number) => `${string}:<TAB>${number}
`;
```
has a literal TAB character (regardless of the code using \t) which is the possibly bad behavior; and the \r\n in the code turns into a literal CRLF which leads to broken output. In contrast, with my change, you get \t and \r\n respectively.
The same thing happens in the error output from the file, which is not as important, but still confusing. (Plus, the fact that these strings get to error output further supports using escapes.)

If this is indeed the only way to exercise this code, then maybe escape plain LFs (as \ns) too? (Instead of the exception for \r\n)

1. `getRawLiteral()`: barf if `currentSourceFile` is missing, since if it is, then the following `getSourceTextOfNodeFromSourceFile` will return a bogus `""`. 2. One `||` -> `??` change. 3. `backtickQuoteEscapedCharsRegExp`: escape the usual control characters except for a simple LF. This code does get used to generate backtick strings when `rawText` is not given, and not escaping things like TAB characters can get mangled by editor settings. Worse, not escaping a CRLF and putting it verbatim in sthe string source will interpret it as LF, so add a special case for escaping these as `\r\n`. Added test. Related to microsoft#44313 and microsoft#40625.

rbuckton · 2021-08-04T22:18:35Z

Do we not normalize CRLF in template literal types? If so, that seems like a big disparity with template literal expressions (but perhaps unavoidable?).

rbuckton · 2021-08-04T22:20:07Z

[...] and the \r\n in the code turns into a literal CRLF which leads to broken output. [...]

How does this break the output? Do we not handle CRLF when parsing template literal types?

elibarzilay · 2021-08-04T22:30:11Z

Do we not normalize CRLF in template literal types?

I'm not sure what normalization you're talking about... The actual type has \r\n, and since there's no rawText for literals in types, then this code gets the CRLF and creates a (broken) rawText.

[...] and the \r\n in the code turns into a literal CRLF which leads to broken output. [...]

How does this break the output? Do we not handle CRLF when parsing template literal types?

The original type has \r\n, and the resulting dts line is:

declare const f: (hdr: string, val: number) => `${string}:	${number}
`;

so the CRLF from the type leads to a declaration with just a LF.

rbuckton

We probably need to add a breaking change notice for API consumers that only passing text to one of the factory methods to create template literals will have different escaping rules, and they can supply their own rawText if our escaping doesn't match what they need.

typescript-bot assigned elibarzilay Jul 19, 2021

typescript-bot added Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Jul 19, 2021

elibarzilay force-pushed the backticks-etc branch from a2252b7 to 1f968b1 Compare July 19, 2021 13:21

elibarzilay requested a review from rbuckton July 19, 2021 15:51

rbuckton suggested changes Jul 28, 2021

View reviewed changes

elibarzilay force-pushed the backticks-etc branch 2 times, most recently from cddfe23 to 5c089c3 Compare July 31, 2021 03:59

rbuckton suggested changes Aug 2, 2021

View reviewed changes

elibarzilay force-pushed the backticks-etc branch from 5c089c3 to 0e9b8f4 Compare August 4, 2021 21:23

rbuckton approved these changes Aug 4, 2021

View reviewed changes

elibarzilay merged commit 635f5bd into microsoft:main Aug 5, 2021

elibarzilay deleted the backticks-etc branch August 5, 2021 09:45

maxpatiiuk mentioned this pull request Jul 5, 2024

bug(Printer): LF newline in template strings is escaped when it shouldn't be #59150

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Some tweaks for backtick strings #45099

Some tweaks for backtick strings #45099

Uh oh!

elibarzilay commented Jul 19, 2021 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

rbuckton Jul 28, 2021

Uh oh!

elibarzilay Jul 28, 2021

Uh oh!

rbuckton Aug 2, 2021

Uh oh!

Uh oh!

elibarzilay commented Jul 31, 2021

Uh oh!

rbuckton left a comment

Uh oh!

rbuckton Aug 2, 2021

Uh oh!

Uh oh!

elibarzilay commented Aug 4, 2021

Uh oh!

rbuckton commented Aug 4, 2021

Uh oh!

rbuckton commented Aug 4, 2021

Uh oh!

elibarzilay commented Aug 4, 2021

Uh oh!

rbuckton left a comment

Uh oh!

Uh oh!

Some tweaks for backtick strings #45099

Some tweaks for backtick strings #45099

Uh oh!

Conversation

elibarzilay commented Jul 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rbuckton Jul 28, 2021

Choose a reason for hiding this comment

Uh oh!

elibarzilay Jul 28, 2021

Choose a reason for hiding this comment

Uh oh!

rbuckton Aug 2, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

elibarzilay commented Jul 31, 2021

Uh oh!

rbuckton left a comment

Choose a reason for hiding this comment

Uh oh!

rbuckton Aug 2, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

elibarzilay commented Aug 4, 2021

Uh oh!

rbuckton commented Aug 4, 2021

Uh oh!

rbuckton commented Aug 4, 2021

Uh oh!

elibarzilay commented Aug 4, 2021

Uh oh!

rbuckton left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

elibarzilay commented Jul 19, 2021 •

edited

Loading