clarify that utf-8 is just a possible encoding of strings #684
**Result Coercion**

Fields returning the type {String} expect to encounter UTF-8 string internal values.
I think this part should continue to specify UTF-8? As I read it, it's about the serialization of strings in responses, where specifying an encoding is actually appropriate?
This section is about result coercion, not serialization. Of course Strings can be serialized to UTF-8 (and often are, via UTF-8 JSON), but they don't have to be.
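To illustrate the distinction between result coercion and serialization, here is a minimal sketch. The names `coerce_string_result` and `serialize_response` are hypothetical and not taken from the spec or any implementation, and the coercion rules shown are only one reasonable reading. Coercion produces an in-memory string value; choosing a byte encoding is a separate, later step.

```python
import json


def coerce_string_result(value):
    """Hypothetical result coercion for a String field.

    Returns a Python str, i.e. a sequence of Unicode code points.
    No byte encoding is involved at this stage.
    """
    if isinstance(value, str):
        return value
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    raise TypeError(f"String cannot represent value: {value!r}")


def serialize_response(data, encoding):
    """Hypothetical serialization step: JSON text encoded to bytes."""
    return json.dumps(data, ensure_ascii=False).encode(encoding)


# Coercion yields the same value regardless of how it is later sent.
coerced = coerce_string_result("na\u00efve \U0001F600")
response = {"data": {"greeting": coerced}}

# Serialization is where an encoding is picked; UTF-8 is common, not required.
as_utf8 = serialize_response(response, "utf-8")
as_utf16 = serialize_response(response, "utf-16")
```

The two byte sequences differ, but both decode back to the same response, which is why the encoding does not belong in the coercion rules.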
@andimarek Maybe I'm missing something, but the discussion was about extending the range of possible code points, not removing UTF-8 from the spec? If you need to send a string in some other encoding you can always create a custom scalar for that and with
@IvanGoncharov this is just a cleanup/correction. As discussed today, the current section mentioning UTF-8 is simply wrong: UTF-8 is one of several possible Unicode encodings. Strings are sequences of Unicode code points, not UTF-8 strings. In fact, the reference implementation itself uses UTF-16 to represent Strings (because JS uses UTF-16 internally to encode Unicode). Also, sending data over the wire (serialization) is different from scalar coercion. The most commonly used serialization format is JSON, which in turn is normally encoded in UTF-8. We have a separate section on how to serialize to JSON. But this is in no way required: UTF-8 encoded JSON serialization is just one option.
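As a small illustration of the point that a string is a sequence of code points and UTF-8 or UTF-16 are just encodings of it, here is a sketch in Python. The helper names and the sample string are made up for this example.

```python
def code_points(text):
    """Return the Unicode code points that make up the string."""
    return [ord(ch) for ch in text]


def utf16_code_units(text):
    """Count the 16-bit code units needed to encode the string in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


# "héllo 😀" written with escapes so the code points are unambiguous.
sample = "h\u00e9llo \U0001F600"

utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")

print(len(code_points(sample)))   # 7 code points
print(len(utf8_bytes))            # 11 bytes in UTF-8
print(utf16_code_units(sample))   # 8 code units in UTF-16 (surrogate pair for the emoji)

# Different bytes, same sequence of code points after decoding.
assert utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-le") == sample
```

The same seven code points come out as 11 bytes in UTF-8 and 8 code units in UTF-16, so describing the String type in terms of code points leaves the internal representation up to each implementation.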
@andimarek I made some edits, let me know if these look good to you
I'm going to merge this now since this is the other half of the change made in #854 |
This change tries to clarify that String scalars are not always UTF-8 strings, but sequences of Unicode code points, which could be encoded as UTF-8 but don't have to be.