-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[RFC] __id field for unique identifiers #232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -6,3 +6,4 @@ build | |||
out | |||
node_modules | |||
npm-debug.log | |||
.vscode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add this to your global gitignore ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call!
This is looking good - thanks for coalescing the RFC. In my mind the biggest question is how I'm not sure that expressing ids as nullable is what we want, for example if I fetched a bunch of |
Another piece of feedback is just at a high level - why should we be including this in the spec with additional behavior rather than simply making an Specifically I think the argument "client caches would like to rely on this" is a little faulty as we already have client caches that rely on different forms of pagination and connection models with those as best practices and no impact on the spec. I know we've talked before about changes to the GraphQL execution engine itself before that needed some way to identify objects that perhaps we should allude to in this RFC. For example, one of the reasons that led me to think that |
I recall seeing that the |
More generally, Practically, using a non prefixed field is only reasonable when suggesting common practice, and not appropriate for use by the GraphQL spec itself, since perhaps clients already have a field called In a final decision where this should be a common practice, then |
In my opinion the specification of Being able to identify data is important, so then the next question is how we identify that data. Whichever approach we choose, it must be consistent. There are problems with So in summary, this issue (in my opinion) is the most important issue to resolve for the long term success of GraphQL, and since there are problems with the community solution (Relay |
Also, just a thought, if instead of having {
__id # Encoded value of an empty path or array: `[]`.
a {
__id # User defined to be the value `unique-id`.
b {
__id # Encoded path `["unique-id", "b"]`.
c(x: true) {
__id # Encoded path `["unique-id", "b", "c"]`
alias: d {
__id # Encoded path `["unique-id", "b", "c", "alias"]`
}
}
}
}
} I don’t particularly like this idea, but @leebyron mentioned requiring non-nullable ids by default 😉. Ultimately I don’t think missing an |
@calebmer coincidentally this is very similar to how the Apollo Client cache deals with missing IDs.
This is pretty inconvenient if GraphQL is meant as a thin API layer, since people often have underlying model objects that already have an
There's one big argument for this being in the spec, which is similar to @calebmer's - this gives GraphQL tools, services, and libraries the ability to reliably cache data coming from any GraphQL server. I'd argue that is a much more fundamental requirement than pagination. Given that, having the In the current situation:
This means that people getting started with GraphQL can't get the benefits of caching without a lot of effort, and makes GraphQL seem less powerful out of the box. If this proposal or similar could be included in the spec, then every single GraphQL tool could have caching built in by default. |
It's important to separate aspects of It's already possible to write a GraphQL client that correctly caches data coming from any server. This can be achieved with product-specific configuration (e.g. a mapping of type names to the identifying field(s) to use) or with convention (as in Relay's "Object Identification" spec). Given that, As @leebyron pointed out, though, other proposals may require something like
On the one hand, I do see a lot of value in |
That's not the main reason I'm interested in this proposal. I mostly just think it's unfortunate that such a core feature of pretty much all GraphQL servers and clients is unspecified and requires extra configuration, which makes it harder to build generic tools. But let's take a look: deferI think this can easily be implemented without caching IDs. The new results can simply have a JSON path, in the same format as the error path in #230, which will allow the client to put the new result in the right place. Of course, that requires maintaining the original query result until the rest of the query results arrive, but I think that's preferable to requiring caching IDs for any deferred fields. Normalized responsesFrom my understanding of how Relay and Apollo work, a normalized transport wouldn't help either one. Both clients have different opinions about how the normalization should be done, and so neither would use the graphql spec normalization directly. Perhaps a normalized transport would make the response size more efficient, but in that case it might be able to use something like JSON paths? Although I agree it would be more compelling to have it use a standard ID, especially since it would only make sense to normalize repeated objects which would need to be identified. Delta payloads for livequeriesI think these and In my mind, this is a case of taking something that is very clearly a "best practice" in GraphQL, that almost any serious tool needs to take advantage of, and making it "standard". I think almost anything can be done without a standardized |
Just wanted to add my 5 cents into this discussion. In general, I don't have anything against a built-in concept of I also thinking about implications on the end-users who are writing a GraphQL server and possible complexity of GraphQL server implementation itself. For instance, it would be very inconvenient if I need to explicitly define an ID for every single GraphQL object type in the schema (especially if they should be globally unique). I doubt that it would be even possible in most cases. I also not very optimistic about generically inferring constructed IDs based on some real object IDs. For instance, give this schema: type User {
id: String
profilePicture(size: Int!): Image
}
type Image {
width: Int!
height: Int!
url: String!
labels: [ImageLabel!]
}
type ImageLabel {
name: String!
description: String
} and this query: query AwesomePictures {
user(id: "123") {
id
awesomePicture: profilePicture(size: 300) {
awesomeLabels: labels {
description
}
}
}
} According to the semantics of my data, only What would be the ID of the labels in this case? Something like: Maybe instead of defining an interface __Identifiable {
id: ID!
} With this interface user can very precisely identify which objects do have an identifier and which don't. I think this will have a huge advantage not only for client-side caching, but also for normalized data transfer. I'm thinking about JSON Graph here. The problem is that we can't really unify particular object within different parts of the query because the same field name may have very different semantic value or even type (if alias is used). But we have the similar situation with fragment unification. It is solved with |
It looks there's a consensus that having an In my mind the biggest benefit would come from allowing Without that property, I think this is mostly indistinguishable from a client-specific convention to use a special interface like So there could be a different proposal that would:
In my original proposal they would simply be returned as |
Now that I think about it, given this built-in |
The availability of At a minimum we should standardize the semantics of an |
Per @stubailo and @calebmer's comments above, I'd propose that we separate this into two proposals: defining the semantics of the
|
Does the shortened proposal basically come down to:
Or is there anything else? And yes, I think there could be a general discussion to be had about the benefits of schema-less GraphQL client libraries and tools, and some simple changes that could make them much more convenient to build and use. The benefit here is simplifying adoption of GraphQL as the next standard for APIs - such tools will inherently be easier to adopt than those which require special tooling or schema access, the same way that untyped languages are easier to get started with than typed ones, and generally require less tooling. Where would be the best place to have that discussion? |
@stubailo That's a good summary, yup!
How about opening another issue for this topic in order to start the discussion and get more input? It would be good to clearly define the goal first in order to ground the discussion, and then list out any technical challenges that make that goal harder to achieve under the current spec. I think we're one or two steps away from a RFC, as there may be better alternatives than just allowing |
Great conversation so far. Let me try to round up where we are and some decisions that need to be clarified and made:
|
Here's a counter-proposal to consider: we could standardize a clarified version of the current best practice. Constrain
|
I'm not sure this is a factor for I think taking up the name |
Ya, to be clear the "Easy to auto-add to queries for schemaless clients." has nothing to do with |
Yeah - it certainly can require people to work around this which is a disadvantage for moving quickly, however perhaps it's an advantage for the design quality of the resulting APIs? Directly exposing SQL IDs is a "code smell" since it leaks implementation details about the API. (we had similar issues at Facebook - to interop with old APIs we exposed I could also argue that the "pit of success" would have us make the Another thing to consider is the case where objects are both uniquely cacheable AND are refetchable. It is probably confusing to say that you should use the I guess another way to rephrase this concern is that clear API design would lead us to use only Should we actually allow for this kind of confusing scenario? |
Deciding the GraphQL specification is going to favor one path or “put of success” over another, the specification begins to assume a certain use case which can be dangerous. Naming the field All of these applications may be able to make use of identifying the data in their responses, but not all may appreciate the naming collision. Alternative proposal:After better understanding some of the hesitations around the It seems as if the name of the unique cache-key field should really be a schema designer’s decision. We can (hopefully) agree that by specifying type Person {
_id: ID! @identifier
name: String!
}
interface Node {
id: ID! @identifier
}
type Post implements Node {
id: ID! @identifier
title: String!
} Or: type Person @identifiable(field: "_id") {
_id: ID!
name: String!
}
interface Node @identifiable(field: "id") {
id: ID!
}
type Post implements Node @identifiable(interface: "Node") {
id: ID!
title: String!
} A field on a type or interface may be marked as the type’s identifier field. That field must have a type of What the In clients, the cache key would become the In the introspection query if we wanted to get real meta, the Pros:
Cons:
|
@stubailo - could you please explain this a bit more in detail?
I don't mean to derail this conversation, but I'd really like to understand where you were heading with this, because if it is what I think it is, you get a 👍 from me. However, this is my confusion. If it is what I think you were getting at, then any definition of a fixed field type, even an ID, which is so common, basically kills my understanding, thus my asking for a bit of an explanation. 😄 I guess this is my question. Isn't the notion of "variable schema" (I don't like the term schema-less, as it is inappropriate) and the necessity of an ID field contradictory? 😄 Scott |
I'm going to close this PR since there doesn't seem to be a clear consensus about the design of such an addition to the spec, or whether it's necessary at all. I am partial to Lee's idea in #232 (comment) about constraining any fields named I still deeply believe that standardizing the concept of an identifier for objects will make GraphQL more useful overall and reduce the amount of boilerplate and configuration most developers need to write, so if there is an agreement on how that can happen I'd be happy to write a PR for the spec. |
Thanks for your work so far on this. I'm still interested in a solution for broad identifiability. Hopefully we can find consensus on this in a future RFC |
Was there any kind of follow-up on this closed proposal? |
This brings together many of the considerations about a
__id
unique identifier field into one document. There is still a lot of work to do before this becomes a merge-able diff on the spec:__id
can live next to__typename