Improve writeCodePointValue KDoc #314
@@ -130,6 +130,9 @@ internal fun String.utf8Size(startIndex: Int = 0, endIndex: Int = length): Long
* Without such a conversion, data written to a [Sink] can not be converted back
* to a string from which a surrogate pair was retrieved.
*
* More specifically, all code points mapping to UTF-16 surrogates (`U+d800`..`U+dfff`)
* will be written as `?` characters (`U+003f`).
Is this JVM-only behavior or common behavior?
Currently, it's a common one
This makes it inconsistent with String.encodeToByteArray, where the replacement is unspecified, but different.
An `encodeToByteArray` call on a string containing only a low surrogate results in a byte array containing `?`: https://pl.kotl.in/uFX3YULin
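For anyone who wants to reproduce this locally, here is a minimal sketch of that check. The helper name `toHex` is made up for illustration. The printed bytes may differ between Kotlin targets, which is the point raised above about the replacement being unspecified, so no expected output is given.

```kotlin
// Hypothetical helper: renders each byte as two lowercase hex digits.
fun toHex(bytes: ByteArray): String =
    bytes.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') }

fun main() {
    // A string holding only a low surrogate, with no preceding high surrogate.
    val loneLowSurrogate = "\uDC00"

    // encodeToByteArray() replaces malformed sequences; which bytes it uses
    // as the replacement is platform-dependent, so compare across targets.
    println(toHex(loneLowSurrogate.encodeToByteArray()))
}
```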
@@ -130,6 +130,9 @@ internal fun String.utf8Size(startIndex: Int = 0, endIndex: Int = length): Long
* Without such a conversion, data written to a [Sink] can not be converted back
* to a string from which a surrogate pair was retrieved.
*
* More specifically, all code points mapping to UTF-16 surrogates (`U+d800`..`U+dfff`)
* will be written as `?` characters (`U+003f`).
*
* @param codePoint the codePoint to be written.
*
* @throws IllegalStateException when the sink is closed.
Makes sense to document that IllegalArgumentException can be thrown.
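To make the documented behavior concrete, here is a rough sketch of how a single code point could be turned into UTF-8 bytes with the surrogate replacement described in the KDoc. This is not the library's implementation: `encodeCodePointSketch` is a hypothetical name, and it assumes the `IllegalArgumentException` is meant for values outside `0..0x10ffff`, which this thread does not spell out.

```kotlin
// Hypothetical sketch, not the kotlinx-io implementation.
fun encodeCodePointSketch(codePoint: Int): ByteArray {
    // Assumption: out-of-range values are rejected; require() throws
    // IllegalArgumentException when the condition is false.
    require(codePoint in 0..0x10FFFF) {
        "Code point out of range: 0x${codePoint.toString(16)}"
    }
    return when {
        // One byte: ASCII.
        codePoint < 0x80 -> byteArrayOf(codePoint.toByte())
        // Two bytes: 110xxxxx 10xxxxxx.
        codePoint < 0x800 -> byteArrayOf(
            (0xC0 or (codePoint shr 6)).toByte(),
            (0x80 or (codePoint and 0x3F)).toByte()
        )
        // UTF-16 surrogates are not valid scalar values; write '?' (U+003F).
        codePoint in 0xD800..0xDFFF -> byteArrayOf('?'.code.toByte())
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
        codePoint < 0x10000 -> byteArrayOf(
            (0xE0 or (codePoint shr 12)).toByte(),
            (0x80 or ((codePoint shr 6) and 0x3F)).toByte(),
            (0x80 or (codePoint and 0x3F)).toByte()
        )
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
        else -> byteArrayOf(
            (0xF0 or (codePoint shr 18)).toByte(),
            (0x80 or ((codePoint shr 12) and 0x3F)).toByte(),
            (0x80 or ((codePoint shr 6) and 0x3F)).toByte(),
            (0x80 or (codePoint and 0x3F)).toByte()
        )
    }
}
```

With this sketch, every value in `0xD800..0xDFFF` produces the same single byte, so the original surrogate cannot be recovered when reading the data back, as the KDoc warns.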
Follow up to #308