[DRAFT][stdlib] String.Index: Add custom printing #58479
Closed
@swift-ci test
Azoy approved these changes on Apr 28, 2022
Code changes look good to me! Awaiting what evolution says about the format...
kastiglione added a commit to swiftlang/llvm-project that referenced this pull request on Oct 28, 2022
Implement a type summary for Swift's `String.Index`. The summary string follows the following: 1. Original proposal: https://forums.swift.org/t/improving-string-index-s-printed-descriptions/57027 2. Proposed implementation: swiftlang/swift#58479 3. Temporary(ish) near-`CustomStringConvertible` implementation: swiftlang/swift#61548 The associated test cases are taken from the test cases in swiftlang/swift#58479. rdar://99211823
kastiglione added a commit to swiftlang/llvm-project that referenced this pull request on Nov 2, 2022
Implement a type summary for Swift's `String.Index`. The summary string follows the following: 1. Original proposal: https://forums.swift.org/t/improving-string-index-s-printed-descriptions/57027 2. Proposed implementation: swiftlang/swift#58479 3. Temporary(ish) near-`CustomStringConvertible` implementation: swiftlang/swift#61548 The associated test cases are taken from the test cases in swiftlang/swift#58479. rdar://99211823 (cherry picked from commit c7146a3)
kastiglione added a commit to swiftlang/llvm-project that referenced this pull request on Feb 9, 2023
Implement a type summary for Swift's `String.Index`. The summary string follows the following: 1. Original proposal: https://forums.swift.org/t/improving-string-index-s-printed-descriptions/57027 2. Proposed implementation: swiftlang/swift#58479 3. Temporary(ish) near-`CustomStringConvertible` implementation: swiftlang/swift#61548 The associated test cases are taken from the test cases in swiftlang/swift#58479. rdar://99211823 (cherry picked from commit c7146a3)
Closing in favor of #75433.
Labels
standard library
Area: Standard library umbrella
swift evolution pending discussion
Flag → feature: A feature that has a Swift evolution proposal currently in review
[This changes the public API of the stdlib, so it will likely need to go through the Swift Evolution process before landing.]
Forum pitch: https://forums.swift.org/t/improving-string-index-s-printed-descriptions/57027
This PR conforms `String.Index` to `CustomStringConvertible` and `CustomDebugStringConvertible`, making it easier to understand what these indices actually are, and considerably simplifying the debugging experience when working with string indices.

Having String indices print nicer will be particularly helpful while working with the new string processing algorithms.
For reference, in Swift 5.6, `String.Index` uses an unhelpful, mirror-based description that is not at all human-readable, not even for the humans working on the implementation of `String` in the stdlib.

String indices are simply offsets from the start of the string's underlying storage representation, referencing a particular UTF-8 or UTF-16 code unit, depending on the string's encoding. Most Swift strings are UTF-8 encoded, but strings bridged over from Objective-C may remain in their original UTF-16 encoded form.
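As a rough illustration of why the encoding matters, the sketch below measures the same string position in UTF-8 and in UTF-16 code units using only existing public API. The sample string and variable names are assumptions for illustration, not taken from this PR, and the values are computed distances rather than the raw storage offset held in the index.

```swift
// Assumed sample string (not from the PR): six Cyrillic letters, each of which
// takes 2 UTF-8 code units but only 1 UTF-16 code unit.
let word = "Привет"

// The position of the fourth character, "в".
let position = word.index(word.startIndex, offsetBy: 3)

// The same position, measured in code units of each encoding.
let utf8Offset = word.utf8.distance(from: word.utf8.startIndex, to: position)
let utf16Offset = position.utf16Offset(in: word)

print(utf8Offset)  // 6
print(utf16Offset) // 3
```

The two numbers differ for the same position, which is why an offset on its own is ambiguous unless the encoding it refers to is also known.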
For `CustomStringConvertible`, the index description displays the storage offset value and its encoding.

Note how the start index does not care about its storage encoding -- offset zero is the same location in either case.
String index ranges print in a compact, easily understandable form.
Exposing the actual storage offsets in the description effectively demonstrates how indices work, helping people gain a better understanding of both the underlying Unicode concepts, and the details of their implementation in Swift.
For example, successive `String` indices tend to skip code units in irregular ways, reflecting the size of the underlying grapheme clusters.

Note how the initial emoji takes up 8 code units, followed by a 1-unit ASCII space, and ending with a series of Cyrillic characters that take two UTF-8 code units each.
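The irregular steps can be reproduced without the new descriptions by computing each character index's UTF-8 distance from the start. This is a minimal sketch: the sample string is an assumption reconstructed from the prose (a waving-hand emoji with a skin tone modifier, a space, then a Cyrillic word), and the variable names are hypothetical.

```swift
// Assumed sample string: U+1F44B U+1F3FC, a space, then six Cyrillic letters.
let greeting = "\u{1F44B}\u{1F3FC} Привет"

// UTF-8 offset of the start of every Character (extended grapheme cluster).
let characterOffsets = greeting.indices.map {
    greeting.utf8.distance(from: greeting.utf8.startIndex, to: $0)
}

print(characterOffsets) // [0, 8, 9, 11, 13, 15, 17, 19]
```

The jump from 0 to 8 is the two-scalar emoji, the step of 1 is the space, and the remaining steps of 2 are the Cyrillic letters.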
Looking at indices in the Unicode scalars view shows how the emoji breaks into two separate code points (U+1F44B and U+1F3FC, at offsets 0 and 4, respectively).
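A similar sketch for the Unicode scalars view, again using an assumed sample string that starts with the emoji and measuring UTF-8 distances with existing API rather than relying on the printed descriptions proposed here:

```swift
// Assumed sample string: U+1F44B U+1F3FC, a space, then six Cyrillic letters.
let greeting = "\u{1F44B}\u{1F3FC} Привет"
let scalars = greeting.unicodeScalars

// UTF-8 offset of the start of every Unicode scalar.
let scalarOffsets = scalars.indices.map {
    greeting.utf8.distance(from: greeting.utf8.startIndex, to: $0)
}

// Print each scalar's code point next to its offset.
for (scalar, offset) in zip(scalars, scalarOffsets) {
    let hex = String(scalar.value, radix: 16, uppercase: true)
    print("U+\(hex) at UTF-8 offset \(offset)")
}
```

With this sample string, the first two lines printed are for U+1F44B at offset 0 and U+1F3FC at offset 4, since each of those scalars takes four UTF-8 code units.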
This is a native UTF-8 string, so indices in the UTF-8 view are rather boring -- they simply count offsets from 0 up to 21.
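The same count can be checked with a short sketch over the UTF-8 view. The sample string is an assumption based on the prose (8 code units of emoji, 1 for the space, 12 for the Cyrillic letters).

```swift
// Assumed sample string: U+1F44B U+1F3FC, a space, then six Cyrillic letters.
let greeting = "\u{1F44B}\u{1F3FC} Привет"
let utf8 = greeting.utf8

// Every UTF-8 index is exactly one code unit past the previous one.
let utf8Offsets = utf8.indices.map { utf8.distance(from: utf8.startIndex, to: $0) }
let endOffset = utf8.distance(from: utf8.startIndex, to: utf8.endIndex)

print(utf8Offsets == Array(0..<21)) // true
print(endOffset)                    // 21
```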
UTF-16 indices are rather more interesting, as the string starts with two Unicode scalars outside the BMP. In UTF-16, these are encoded as surrogate pairs, which aren't directly present in this string's storage. To manage this, the index values for the trailing surrogates include a transcoded offset value (shown as `+1` in the printed description), to help identify which code unit is addressed by the index.

The `CustomDebugStringConvertible` output is a bit more verbose. In addition to the offset + encoding, it also includes the information that is maintained in the bits of the index that are reserved for performance flags and other auxiliary data.

For example, consider an index `i` addressing the UTF-8 code unit at offset 10 in some string, which happens to be the first code unit in a `Character` (i.e., an extended grapheme cluster) of length 8.

rdar://
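To make the earlier UTF-16 discussion concrete, the sketch below walks the UTF-16 view of an assumed sample string and records which positions hold trailing surrogates. Those are the positions that have no code unit of their own in UTF-8 storage, and so would need a transcoded offset. It uses only existing public API (`utf16Offset(in:)` and `UTF16.isTrailSurrogate`), not the descriptions proposed in this PR; the string and variable names are assumptions.

```swift
// Assumed sample string: U+1F44B U+1F3FC, a space, then six Cyrillic letters.
let greeting = "\u{1F44B}\u{1F3FC} Привет"
let utf16 = greeting.utf16

// Collect the UTF-16 offsets whose code unit is a trailing surrogate.
var trailSurrogateOffsets: [Int] = []
for index in utf16.indices where UTF16.isTrailSurrogate(utf16[index]) {
    trailSurrogateOffsets.append(index.utf16Offset(in: greeting))
}

print(utf16.count)           // 11
print(greeting.utf8.count)   // 21
print(trailSurrogateOffsets) // [1, 3]
```

Each of the two leading scalars becomes a surrogate pair in UTF-16, so the view has 11 code units (4 + 1 + 6) against 21 in UTF-8, and the trailing halves of the pairs sit at UTF-16 offsets 1 and 3.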