Support a built-in type for well-formed strings

### 🔍 Search Terms

"Unicode", "well-formed Unicode", "valid Unicode", "lone surrogates", "UTF-16", "UTF-8", "isWellFormed()", "toWellFormed()"

### ✅ Viability Checklist

- [x] This wouldn't be a breaking change in existing TypeScript/JavaScript code
- [x] This wouldn't change the runtime behavior of existing JavaScript code
- [x] This could be implemented without emitting different JS based on the types of the expressions
- [x] This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
- [x] This isn't a request to add a new utility type: https://github.com/microsoft/TypeScript/wiki/No-New-Utility-Types
- [?] This feature would agree with the rest of our Design Goals: https://github.com/Microsoft/TypeScript/wiki/TypeScript-Design-Goals

### ⭐ Suggestion

ES2024 adds [`String.prototype.isWellFormed()` and `String.prototype.toWellFormed()`][usv-string], which [are supported][es2024-string] in TypeScript's ES2024 type definitions.

However, much of the value of these functions is lost in TypeScript because there is no well-formed string type.

What I'd like to see is a "well-formed string" type (itself a subtype of `string`) for which `isWellFormed()` serves as a type guard, and which `toWellFormed()` (as well as functions like `TextDecoder.decode()`) returns.

Additionally, string literals could be determined to be of the well-formed string type at compile time.

This way, TypeScript developers could get type safety in scenarios where strings must be guaranteed to be well-formed.

[usv-string]: https://github.com/tc39/proposal-is-usv-string
[es2024-string]: https://github.com/microsoft/TypeScript/blob/main/src/lib/es2024.string.d.ts

### 📃 Motivating Example

I'm working on a TypeScript implementation of [CEL](https://cel.dev) which requires passing well-formed UTF-8 strings into an evaluation environment. If I want to bridge TypeScript's type safety to CEL's type safety, I'll need a well-formed string type in TypeScript.

### 💻 Use Cases

I can do something like this in my project:

```ts
interface WellFormedString extends String {
  __brand: "WellFormed";
}

interface String {
  isWellFormed(): this is WellFormedString;
  toWellFormed(): WellFormedString;
  toUpperCase(): this extends WellFormedString ? WellFormedString : string;
  toLowerCase(): this extends WellFormedString ? WellFormedString : string;
}

interface TextDecoder {
  decode(input?: AllowSharedBufferSource, options?: TextDecodeOptions): WellFormedString;
}

function useWellFormedString(a: WellFormedString) {
  // ...
}

// good -- no error
useWellFormedString("hello".toWellFormed());

// good -- no error
useWellFormedString("hello".toWellFormed().toUpperCase());

// good -- no error
const h = "hello";
if (h.isWellFormed()) {
  useWellFormedString(h);
}

// good -- no error
// (the decoder coerces a lone "WTF-8" surrogate to "\ufffd\ufffd\ufffd")
useWellFormedString(new TextDecoder().decode(new Uint8Array([0xed, 0xba, 0xad])));

// good -- error
// (malformed string with lone UTF-16 surrogate)
useWellFormedString("\udead");

// bad -- error
useWellFormedString("hello");

// bad -- error
useWellFormedString("hello" as WellFormedString);
```

But there are some significant disadvantages here:

1. String literals that are well-formed are not recognized as such, so `useWellFormedString("hello")` is an error.
2. It relies on a branding hack.
3. The compiler complains about the `as WellFormedString` cast (maybe this is fixable, but I don't know how).
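
On the casting complaint, one possible workaround is to brand the primitive `string` with an intersection type instead of extending the `String` interface, and to hand out the brand only through a checked helper. The following is a minimal sketch under that assumption, not a proposal for the built-in type. The names `WellFormed`, `isWellFormed`, `asWellFormed`, and `useWellFormed` are hypothetical, and the manual surrogate scan stands in for `String.prototype.isWellFormed()` so that the snippet does not depend on the ES2024 lib typings.

```typescript
// Hypothetical brand: an intersection with the primitive `string`,
// rather than an interface that extends `String`.
type WellFormed = string & { readonly __brand: "WellFormed" };

// Type guard: true when `s` contains no lone UTF-16 surrogates.
function isWellFormed(s: string): s is WellFormed {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: the next code unit must be a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

// Checked conversion: throws instead of silently accepting a malformed string.
function asWellFormed(s: string): WellFormed {
  if (!isWellFormed(s)) {
    throw new TypeError("string contains a lone surrogate");
  }
  return s; // narrowed to WellFormed by the guard, so no cast is needed
}

// Stand-in consumer that only accepts branded strings.
function useWellFormed(a: WellFormed): number {
  return a.length;
}

useWellFormed(asWellFormed("hello"));
```

This still leaves the first two disadvantages in place: a plain literal such as `"hello"` is not a `WellFormed` until it passes through the guard or the helper, and the brand is still a convention rather than something the compiler understands.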

Support a built-in type for well-formed strings #60765
