clarify that utf-8 is just a possible encoding of strings #684
human-readable text. All response formats must support string representations,
and that representation must be used here.

**Result Coercion**

Fields returning the type {String} expect to encounter UTF-8 string internal values.
I think this part should continue to specify UTF-8? As I read it, it's about the serialization of strings in responses, where specifying an encoding is actually appropriate?
This section is about result coercion, not serialization. Of course Strings can be serialized to UTF-8 (and often they are, via UTF-8 JSON), but they don't have to be.
@andimarek Maybe I'm missing something, but the discussion was about extending the range of possible code points, not removing UTF-8 from the spec? If you need to send a string in some other encoding you can always create a custom scalar for that and with
@IvanGoncharov this is just a cleanup/correction. As discussed today, the current section mentioning UTF-8 is just wrong: UTF-8 is one of the possible Unicode encodings. Strings are sequences of Unicode code points, not UTF-8 strings. In fact, the reference implementation itself uses UTF-16 to represent Strings (because JS uses UTF-16 internally to encode Unicode). Also: sending data over the wire (serialization) is different from scalar coercion. The most commonly used serialization format is JSON, which again is usually encoded in UTF-8. We have an extra section on how to serialize to JSON. But this is in no way required: UTF-8 encoded JSON serialization is just an option.
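To illustrate the distinction drawn above, here is a minimal Python sketch. It is not taken from the spec or from the reference implementation, and the sample string is an arbitrary assumption. It shows one string value as a sequence of Unicode code points and two different byte encodings of that same sequence.

```python
# A string value is a sequence of Unicode code points.
# The sample text is an arbitrary choice for illustration.
text = "héllo 😀"
code_points = [ord(ch) for ch in text]

# The same code points can be encoded in more than one way.
utf8_bytes = text.encode("utf-8")       # variable width: 1 to 4 bytes per code point
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per code unit, surrogate pair for 😀

print(len(code_points))  # 7 code points
print(len(utf8_bytes))   # 11 bytes in UTF-8
print(len(utf16_bytes))  # 16 bytes in UTF-16 (8 code units)

# The byte sequences differ, but both decode to the same code points.
assert utf8_bytes != utf16_bytes
assert utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-le") == text
```

Whichever encoding a server uses internally or on the wire, the String value it represents is the same sequence of code points, which is the point of dropping UTF-8 from the coercion wording.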
@andimarek I made some edits, let me know if these look good to you
I'm going to merge this now since this is the other half of the change made in #854
This change clarifies that String scalars are not necessarily UTF-8 strings but sequences of Unicode code points, which could be encoded as UTF-8 but don't have to be.