Description
Initial discussion: #7394
Initial approval: @CyrusNajmabadi
Speclet: https://github.com/dotnet/csharplang/blob/main/proposals/csharp-13.0/esc-escape-sequence.md
String/Character escape sequence \e
- #7401
- Prototype: Done
- Implementation: Add support for \e escape sequences in c# strings and characters roslyn#70497
- Specification: Not Started
Summary
This proposal adds the string/character escape sequence \e as a shorthand for the character code point 0x1b, commonly known as the ESCAPE (or ESC) character. This character is currently accessible using one of the following escape sequences:
- \u001b
- \U0000001b
- \x1b (not recommended, see the warning referenced at the bottom under Attachments and References.)
With the implementation of this proposal, the following assertions should be true:
char escape_char = '\e';
Assert.IsTrue(escape_char == (char)0x1b, "...");
Assert.IsTrue(escape_char == '\u001b', "...");
Assert.IsTrue(escape_char == '\U0000001b', "...");
Assert.IsTrue(escape_char == '\x1b', "...");
Motivation
Although the System.Console class exposes quite a few ways to interact with the terminal, it is far from supporting every terminal feature. Prominent among the missing ones are 24-bit color support and bold, italic, underlined, or blinking text.
However, these can be emulated by printing (a series of) so-called VT100/ANSI escape codes to the System.Console.Out stream (a reference of ANSI escape sequences can be found in the section Attachments and References of this proposal). Each VT100 escape sequence starts with the character 0x1b (ASCII ESC), followed by a series of characters, such as:
Console.WriteLine("This is a regular text");
Console.WriteLine("\u001b[1mThis is a bold text\u001b[0m");
Console.WriteLine("\u001b[2mThis is a dimmed text\u001b[0m");
Console.WriteLine("\u001b[3mThis is an italic text\u001b[0m");
Console.WriteLine("\u001b[4mThis is an underlined text\u001b[0m");
Console.WriteLine("\u001b[5mThis is a blinking text\u001b[0m");
Console.WriteLine("\u001b[6mThis is a fast blinking text\u001b[0m");
Console.WriteLine("\u001b[7mThis is an inverted text\u001b[0m");
Console.WriteLine("\u001b[8mThis is a hidden text\u001b[0m");
Console.WriteLine("\u001b[9mThis is a crossed-out text\u001b[0m");
Console.WriteLine("\u001b[21mThis is a double-underlined text\u001b[0m");
Console.WriteLine("\u001b[38;2;255;0;0mThis is a red text\u001b[0m");
Console.WriteLine("\u001b[48;2;0;255;0mThis is a green background\u001b[0m");
Console.WriteLine("\u001b[38;2;0;0;255;48;2;255;255;0mThis is a blue text with a yellow background\u001b[0m");
which results in correspondingly styled output in wt.exe and cmd.exe.
Due to the recurring usage of \u001b, a shorter abbreviation such as \e would be welcome. This is comparable to how \n can be used as an abbreviation for \u000a.
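For illustration, two of the statements above could be written with the proposed sequence as follows. This is only a sketch: it assumes a compiler that implements this proposal (the speclet is filed under C# 13.0), and the variable names are illustrative.

```csharp
using System;

// Assumes a compiler that accepts the proposed \e escape sequence.
string bold = "\e[1mThis is a bold text\e[0m";
string red = "\e[38;2;255;0;0mThis is a red text\e[0m";

Console.WriteLine(bold);
Console.WriteLine(red);

// Per the proposal, \e and \u001b denote the same character (U+001B),
// so the two spellings of the string should compare equal.
bool sameAsUnicodeEscape = bold == "\u001b[1mThis is a bold text\u001b[0m";
Console.WriteLine(sameAsUnicodeEscape);
```

The shorter spelling keeps the actual control codes ([1m, [0m, and so on) easier to spot than when each is preceded by six characters of \u001b.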
A further motivation for this proposal is the recurrent usage of the sequence \u001b inside ESC/POS commands when interacting with (thermal) printers, as referenced, for example, in the following documents and articles (thanks @jnm2 !):
- https://learn.microsoft.com/en-us/windows/uwp/devices-sensors/epson-esc-pos-with-formatting
- https://escpos.readthedocs.io/en/latest/commands.html
- https://reference.epson-biz.com/modules/ref_escpos/index.php?content_id=2
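To make the ESC/POS case concrete, the following sketch builds a small command sequence as it would be written today. It assumes the commonly documented commands ESC @ (initialize printer) and ESC E n (emphasized mode on/off); the receipt text and variable names are hypothetical, and sending the bytes to an actual printer is left out.

```csharp
using System;
using System.Text;

// With this proposal, each "\u001b" below could be written as "\e".
string receipt =
    "\u001b@" +          // ESC @   : initialize printer
    "\u001bE\u0001" +    // ESC E 1 : emphasized (bold) mode on
    "TOTAL\n" +          // text followed by LF (print and line feed)
    "\u001bE\u0000";     // ESC E 0 : emphasized mode off

// ESC/POS is byte-oriented; ASCII is sufficient for this sample text.
byte[] payload = Encoding.ASCII.GetBytes(receipt);

// Print the bytes as hex to inspect the command stream.
Console.WriteLine(BitConverter.ToString(payload));
```

Every command in such a stream starts with the ESC character, which is why the long \u001b spelling repeats so often in printer code.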
Detailed design
I propose that the language syntax specification be changed as follows in section 6.4.5.5:
fragment Simple_Escape_Sequence
- : '\\\'' | '\\"' | '\\\\' | '\\0' | '\\a' | '\\b' | '\\f' | '\\n' | '\\r' | '\\t' | '\\v'
+ : '\\\'' | '\\"' | '\\\\' | '\\0' | '\\a' | '\\b' | '\\f' | '\\n' | '\\r' | '\\t' | '\\v' | '\\e'
;
In addition, the last row is added to the following table in the specification:

A simple escape sequence represents a Unicode character, as described in the table below.

| Escape sequence | Character name | Unicode code point |
|---|---|---|
| \' | Single quote | U+0027 |
| ... | ... | ... |
| \e | Escape character | U+001B |

The type of a Character_Literal is char.
Drawbacks
Every new language feature request brings added complexity to the compiler. However, I would argue that implementing this specific feature mainly involves a variation of the existing code that parses escape sequences such as \v, \f, or \a. Some additional effort is also needed to adapt Roslyn's unit tests to accommodate this feature.
Alternatives
The usage of the escape character 0x1b can be implemented using traditional methods, amongst which are:
- \u001b
- \U0000001b
- \x1b (not recommended, see the warning referenced at the bottom under Attachments and References.)
- (char)0x1b
- The assignment of (char)0x1b to a constant variable/field and its usage inside of interpolated strings instead of the direct utilization of the proposed sequence \e.
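The last alternative could look roughly like the following sketch, which compiles without the proposed feature. The constant name Esc is an illustrative choice, not something prescribed by this proposal.

```csharp
using System;

// Alternative without \e: name the character once and interpolate it.
const char Esc = (char)0x1b;

string bold = $"{Esc}[1mThis is a bold text{Esc}[0m";
Console.WriteLine(bold);
```

This works today, but every string that needs the character has to become an interpolated string, and the result is no longer a compile-time constant string.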
Unresolved questions
- Possible impact on existing code analyzers and existing test cases inside the compiler or .NET runtime.
- Should C# come with a new analyzer, which suggests the refactoring of existing occurrences of \x1b, \u001b, etc. towards \e?
- Impact on other .NET languages? This is not relevant for VB.NET, as all strings are de facto verbatim (except for the escaping of double quotes). But what about F#? Should this escape sequence also be proposed there?
Design meetings
Attachments and References
- https://en.wikipedia.org/wiki/Control_character
- https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/#string-escape-sequences
- https://escpos.readthedocs.io/en/latest/commands.html
- https://learn.microsoft.com/en-us/windows/uwp/devices-sensors/epson-esc-pos-with-formatting
- https://en.wikipedia.org/wiki/ANSI_escape_code
- As referenced in the Summary section, https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/ gives a warning concerning the usage of the shorter escape sequence \x1b instead of \u001b (the warning was attached as a picture).
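The pitfall with \x can be reproduced with a short sketch. It relies on the C# rule that a \x escape consumes between one and four hexadecimal digits; the sample strings are made up for illustration.

```csharp
using System;

// \u always takes exactly four hex digits, so the text after it is untouched.
string intended = "\u001bBad";   // ESC followed by "Bad"

// \x greedily takes up to four hex digits: here "1bBa" is consumed,
// producing the single character U+1BBA followed by "d".
string actual = "\x1bBad";

Console.WriteLine(intended.Length);
Console.WriteLine(actual.Length);
Console.WriteLine(actual[0] == '\u1bba');
```

Because the proposed \e has a fixed length, text that follows it can never be absorbed into the escape sequence in this way.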