Utf16.IsValid and Utf8/16.IndexOfInvalidSubsequence #120326
base: main
Pull Request Overview
This PR adds new UTF-16 validation APIs to System.Text.Unicode, introducing the `Utf16.IsValid` and `Utf16.IndexOfInvalidSubsequence` methods alongside similar functionality for UTF-8. It also refactors existing code to use these new standardized validation APIs instead of custom validation logic.
- Adds new `Utf16` class with validation methods `IsValid` and `IndexOfInvalidSubsequence`
- Adds `Utf8.IndexOfInvalidSubsequence` method to complement existing `Utf8.IsValid`
- Refactors multiple codebases to use the new standardized validation APIs
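
To make the UTF-16 side concrete, here is a minimal sketch of what "index of the first invalid subsequence" can mean for UTF-16 text: the position of the first unpaired surrogate. The helper name `IndexOfFirstUnpairedSurrogate` is hypothetical, the `-1` "all valid" return value is an assumption, and the scalar loop is only an illustration; it is not the implementation added in `Utf16.cs`.

```csharp
using System;

// Hypothetical helper (not the PR's code): returns the index of the first
// unpaired surrogate in `value`, or -1 (assumed convention) if none is found.
static int IndexOfFirstUnpairedSurrogate(ReadOnlySpan<char> value)
{
    for (int i = 0; i < value.Length; i++)
    {
        char c = value[i];
        if (char.IsHighSurrogate(c))
        {
            // A high surrogate must be immediately followed by a low surrogate.
            if (i + 1 < value.Length && char.IsLowSurrogate(value[i + 1]))
            {
                i++; // skip the low half of a well-formed pair
                continue;
            }
            return i;
        }
        if (char.IsLowSurrogate(c))
        {
            // A low surrogate with no preceding high surrogate is invalid.
            return i;
        }
    }
    return -1;
}

Console.WriteLine(IndexOfFirstUnpairedSurrogate("abc"));            // -1
Console.WriteLine(IndexOfFirstUnpairedSurrogate("a\uD83D\uDE00b")); // -1 (valid pair)
Console.WriteLine(IndexOfFirstUnpairedSurrogate("ab\uD800c"));      // 2  (lone high surrogate)
Console.WriteLine(IndexOfFirstUnpairedSurrogate("\uDC00"));         // 0  (lone low surrogate)
```

Under this sketch, an `IsValid`-style check is simply "the index is negative", which is presumably why the two methods are proposed together.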
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
System.Private.CoreLib.Shared.projitems | Adds the new Utf16.cs file to the build |
Utf16.cs | New class implementing UTF-16 validation methods |
Utf8.cs | Adds IndexOfInvalidSubsequence method and updates class documentation |
System.Runtime.cs | Adds public API surface for new UTF-16 and UTF-8 validation methods |
HttpEncoder.cs | Refactors surrogate validation to use new Utf16.IndexOfInvalidSubsequence |
Normalization.Icu.cs | Replaces custom validation logic with Utf16.IsValid |
StringSearchValues.cs | Simplifies surrogate validation using Utf16.IsValid |
Utf8UtilityTests.ValidateBytes.cs | Adds test coverage for new Utf8.IndexOfInvalidSubsequence method |
Utf16UtilityTests.ValidateChars.cs | Adds test coverage for new Utf16 validation methods |
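
For the UTF-8 side, the sketch below locates the first invalid byte sequence using only APIs that already ship in .NET 8 (`Rune.DecodeFromUtf8` and `Utf8.IsValid`). The helper `IndexOfFirstInvalidUtf8` is hypothetical, and its `-1` return for valid input is an assumption; it is meant to illustrate the behaviour the new tests exercise, not to reproduce the PR's `Utf8.IndexOfInvalidSubsequence`.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Unicode;

// Hypothetical helper (not the PR's code): decodes one scalar value at a time
// and reports the offset where decoding first fails, or -1 if all input decodes.
static int IndexOfFirstInvalidUtf8(ReadOnlySpan<byte> value)
{
    int offset = 0;
    while (offset < value.Length)
    {
        OperationStatus status =
            Rune.DecodeFromUtf8(value.Slice(offset), out _, out int consumed);
        if (status != OperationStatus.Done)
        {
            // InvalidData (ill-formed bytes) or NeedMoreData (truncated sequence).
            return offset;
        }
        offset += consumed;
    }
    return -1;
}

byte[] good = Encoding.UTF8.GetBytes("h\u00E9llo");
byte[] bad = { 0x61, 0xC3, 0x28 }; // 'a', then a lead byte followed by a non-continuation byte

Console.WriteLine(IndexOfFirstInvalidUtf8(good)); // -1
Console.WriteLine(Utf8.IsValid(good));            // True
Console.WriteLine(IndexOfFirstInvalidUtf8(bad));  // 1
Console.WriteLine(Utf8.IsValid(bad));             // False
```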

     {
         /// <summary>
    -    /// Provides static methods that convert chunked data between UTF-8 and UTF-16 encodings.
    +    /// Provides static methods that convert chunked data between UTF-8 and UTF-16 encodings, and methods that validates UTF-8 sequences.

Grammar error: 'methods that validates' should be 'methods that validate'.
Suggested change:

    - /// Provides static methods that convert chunked data between UTF-8 and UTF-16 encodings, and methods that validates UTF-8 sequences.
    + /// Provides static methods that convert chunked data between UTF-8 and UTF-16 encodings, and methods that validate UTF-8 sequences.


    namespace System.Text.Unicode
    {
        /// <summary>
        /// Provides static methods that validates UTF-16 strings.

Grammar error: 'methods that validates' should be 'methods that validate'.
Suggested change:

    - /// Provides static methods that validates UTF-16 strings.
    + /// Provides static methods that validate UTF-16 strings.

Fixes #118018
Fixes #118113