GraphQL WG Notes - February 2022

Watch the replay: GraphQL Working Group Meetings on YouTube

Agenda

  1. Agree to Membership Agreement, Participation & Contribution Guidelines and Code of Conduct (1m, Lee)
  2. Introduction of attendees (5m, Lee)
  3. Determine volunteers for note taking (1m, Lee)
  4. Review agenda (2m, Lee)
  5. Review previous meeting's action items (5m, Lee)
  6. Deprecation of Inputs => Stage 3 (5m, Stephen)
  7. Update on @defer & @stream (20m, Rob Richard)
  8. Allowing recursive references in fragments.

Determine volunteers for note taking (1m, Lee)

  • Benjie
  • Stephen

Review previous meeting's action items (1m, Lee)

Deprecation of Inputs => Stage 3 (5m, Stephen)

  • PR #805
  • Stephen: The RFC is currently Stage 2; the aim is to get it to Stage 3. Context: the @deprecated directive could previously not be applied to arguments or input fields; this PR adds that ability (see the SDL sketch after this list). One point of debate: if an argument is required, should it be possible to mark it as deprecated? It's not a good idea to do so, but Shopify had a potential use case for it. In GraphQL.js/Java it's been implemented with a validation rule that prevents deprecating required inputs for over a year now and hasn't had pushback. I've written it in the spec as a MUST. Implementation is ready to go.
  • Lee: the only thing blocking it going to stage 3 is editorial review; I'll preemptively make it stage 3 to force myself to do that ;)
  • <General agreement>
  • First accepted RFC of 2022 🎉
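
A quick sketch of what this enables. The schema below is hypothetical; the commented-out line illustrates the MUST described above:

```graphql
type User {
  id: ID!
  name: String
}

input UserFilter {
  name: String
  # Allowed: a deprecated input field must be optional (nullable, or with a default).
  nameContains: String @deprecated(reason: "Use `name` instead.")
  # Invalid per the proposed MUST: a required (non-null, defaultless)
  # input field must not be deprecated.
  # id: ID! @deprecated(reason: "...")
}

type Query {
  users(
    filter: UserFilter
    # Allowed: deprecating an optional argument.
    includeInactive: Boolean @deprecated(reason: "Use `filter` instead.")
  ): [User]
}
```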

Update on @defer & @stream (20m, Rob Richard)

  • Enforcing delivery order of payloads
  • Non-nullable field errors
  • Rob: two issues:
    • Enforcing delivery order of payloads: discussion at graphql/defer-stream-wg#17
      • Agreement: should be in order, following a tree hierarchical approach. Specifics are now laid out.
      • Whenever the execution algorithm hits a @defer/@stream, it references a data structure representing a "promise", which then gets passed through all the functions (ExecuteSelectionSet, CollectFields, ExecuteField, CompleteValue). If the parent structure is passed in, the child must wait for the parent to complete before it can itself resolve, a bit like .then(...) in JS. This prevents deferred payloads from resolving before their parent resolves and, for stream, prevents a higher-numbered index from resolving before a lower-numbered one (see the payload sketch after this section).
      • A potentially confusing case is (see Example B) where a fragment spread is inside another fragment spread; they're seen as siblings because collect fields works recursively, so they won't be held up by each other.
      • Lee: paraphrasing: if you mark a fragment spread as deferred, and that has a fragment spread that's deferred, then those things will be deferred at the same level.
      • Lee: this feels right. It's complex to express it technically but it's quite obvious conceptually.
      • Rob: this feels like a conservative way to move forward.
      • Lee: do you have concerns - are there tradeoffs at play? Is it obviously the right thing to do?
      • Rob: this feels like the most conservative approach. There may be scenarios where we may be trading performance for making client digestion of payloads easier, but we could potentially look at that later.
    • Non-nullable fields in defer/stream payloads
      • The spec says a null in a non-nullable field should bubble up to the first nullable parent field. What if that nullable parent was already sent in an earlier payload?
      • If a deferred fragment's null bubbles all the way up, are we actually breaking any of GraphQL's guarantees? It feels like the fragment being skipped is not unexpected. Stream is more complex; maybe we should add a validation rule preventing @stream on non-nullable lists (and address it with client-controlled nullability).
      • Michael: since we've broken up the request, we can't do the usual null handling. It feels like query batching with the @export directive.
      • Lee: it feels like there's an equivalence between defer/stream and assembling independent queries, which would independently error at the top boundary. With separate incremental payloads in the stream, it's now the client's responsibility to determine how to handle errors when reifying the final list: it might be okay with a partial list, might blow up the entire streamed end result, or might do something cleverer, e.g. determine where the null occurred and re-trigger the null-handling behavior locally.
      • Matt: I'd lean towards @stream and @defer behaving the same way - it's the client's responsibility.
      • Benjie: we also cannot add that validation rule, because it would create a conflict: today a nullable field can be upgraded to non-nullable, but if @stream were not allowed on non-nullable fields, that upgrade would break previously valid queries that used @stream on the field.
      • <Agreement>
      • Rob: I was hoping the validation rule would let us kick the can down the road, but Benjie's point is definitely a problem. So, in conclusion, we should treat defer/stream as null boundaries.
      • Lee: is there any missing information here about how to handle that error?
        1. we could decide whether or not to include the data field at all... actually no, it should be null, as for queries
        2. what should we do with the error: is it up to the client to know enough about the schema to figure out whether it can naively merge the data into the path, or do we need to indicate that this is not safe to merge? A naive client receiving an incremental response would want to stick the data into the given path repeatedly.
      • Michael: via an error code?
      • Rob: we could do that, but code generators would still have to treat the field's type as nullable.
      • Lee: the example on screen [19:40] says the value is null; a naive client would write that to its store, which could lead to a null pointer exception. Maybe the solution is to omit the "data" field, since the value "null" is not correct here. NOTE: for a query, the data is always an object, so setting null instead of an object is obvious. However, data for incremental payloads is NOT always an object; it could be a scalar or something else. So omitting it might make sense.
      • Rob: omitting it would make sense for defer too.
      • Benjie <chat>: given ... @defer { a b c } over {a: Int!, b: Int!, c: Int!}, a result of {a: 1, b: null, c: 3} would yield null (or omission) for the entire deferred fragment; so when generating types, all these fields need to be treated as nullable
      • Matt & Lee: what happens if hasNext is true and then the stream closes? What does that mean? The spec text should say how to interpret/handle that, e.g. as if the final payload errored fatally.
      • Rob: we allow {hasNext: false} on its own (no data) to indicate the end of an iterable, for cases where you can't know in advance that it was about to terminate.
      • Yaacov: if we were to use lists we could give a startPath/endPath and then pass the values through in larger batches.
      • Lee/Benjie: not all server-side streaming primitives operate the same way: an AsyncIterator might work item by item, but you can also work in batches of arrays (e.g. pulling 50 records from a table at a time), and query execution performance has real ramifications. Evaluating the full selection set depth for each item could turn parallel execution into serial execution. It's also expensive in terms of describing the data, since every item has to be wrapped in stream metadata.
      • One option would be to trim the index off the end of the path and concatenate automatically; but doing that would preclude out-of-order streams.
      • Matt: having lists in the streamed response feels very similar to streamed connections.
      • Lee: We might not support batched streams and parallel streams on day one, but allowing space to add that to the API would be wise.
      • Michael: paths could still point to a specific item - an insert to maintain ordering.
      • Michael: is there consensus that the data should be an array for stream?
      • Lee: I'm not sure there's consensus there - we should open a discussion regarding batch/parallel/error bubbling.
      • Yaacov: parallel === out of order?
      • Lee: yes. Sets and lists are represented the same way, but for sets the order isn't important. Streaming a list is fine; when streaming a set, ordering is an unnecessary constraint.
      • Lee: it feels like we should make the data an array, and take the index out of the path. Then for both defer/stream the behaviour is "merge"; for defer it's object merge, for stream it's array merge.
      • Michael: do we even need index?
      • Matt: the index has value for smart clients: if you're merging into a graph, when do you start overwriting, e.g. when queries coming in at the same time update the same path?
      • Lee: also error message paths, etc.
      • Benjie: there are issues with list length: if an error occurs and you can't add an item, the list ends up the wrong length. I feel like defer/stream make things implicitly nullable. There's also a difference between a field being null and being omitted.
      • Lee: if you get an incremental payload and the data says null, the most naive behavior should be to just ignore everything and blow up: something went wrong. A more intelligent client might do a crawl.
      • Lee: it's critical that the data is treated not as "null" but as "currently loading"/etc.
      • <Discussion around errors affecting other things that would have succeeded but maybe shouldn't now>
      • Benjie <chat>: { list @stream { ... @defer {a b c} error } }
      • Stephen: should we specify that stream/defer create a boundary that nulls shouldn't bubble past? Otherwise it's like Schrödinger's null: we don't know whether the data we've already received is valid until we get that deferred payload, and blowing it away again feels wrong.
      • Matt: if the request runs to completion, the result on the client in a deferred world ought to be the same as it would have been with no @defer at all.
      • Benjie: it feels like @stream/@defer effectively act as a @skip directive whose value you don't know in advance: the fields may or may not be returned.
      • Rob: that's how I see it.
      • Roman: should we leave it to the client?
      • Lee: that would be my preference, it feels like we should add a non-normative note for this.
      • Rob: changing the stream payload is a pretty big change; does it make sense to separate stream and move on with defer?
      • Lee: it might. Keep them rolling in parallel. If everyone's confident that defer can be shipped but stream has open questions we could do that; but you don't need to make that separation now.
      • Rob: next steps: open a discussion about changing stream data to an array and removing the index from the payload, for more flexibility in batching (see the sketch below).
  • [20:27]
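
A sketch of the payload shapes discussed, over a hypothetical schema. The format follows the @defer/@stream RFC as it stood at this meeting (one payload per streamed item, with the index in the path); the last commented payload shows the array-based alternative floated above:

```graphql
query {
  me {
    name
    ... @defer(label: "bio") {
      bio
    }
    friends(first: 4) @stream(initialCount: 2, label: "friends") {
      name
    }
  }
}

# Initial payload:
#   { "data": { "me": { "name": "Lee",
#       "friends": [{ "name": "A" }, { "name": "B" }] } },
#     "hasNext": true }
#
# Subsequent payloads. Per the ordering discussion, a payload is never
# delivered before its parent, and friends[3] never precedes friends[2]:
#   { "label": "bio", "path": ["me"], "data": { "bio": "..." }, "hasNext": true }
#   { "label": "friends", "path": ["me", "friends", 2], "data": { "name": "C" }, "hasNext": true }
#   { "label": "friends", "path": ["me", "friends", 3], "data": { "name": "D" }, "hasNext": true }
#   { "hasNext": false }
#
# Null-boundary idea discussed above: if a non-null field inside the deferred
# fragment resolves to null, only that payload's "data" is nulled or omitted;
# already-delivered parent data is not invalidated.
#
# Array-based alternative: send stream data as an array and drop the index
# from the path, e.g.
#   { "label": "friends", "path": ["me", "friends"],
#     "data": [{ "name": "C" }, { "name": "D" }], "hasNext": false }
```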

Allowing recursive references in fragments.

  • Recursive refs in fragments #929
  • GraphiQL uses a TypeRef fragment that is very verbose and has to assume a maximum nesting depth; this could be solved with a recursive fragment.
  • Similarly, if you wanted to look up someone's chain of responsibility.
  • The spec says that fragments must never form cycles because they could result in infinite recursion. That isn't strictly true, though you can hit infinite recursion if the data itself contains a cycle.
  • I argue that the GraphQL server should implement circuit breakers against abuse/errors, so I don't think it's a big deal to add recursive fragments.
  • Michael: currently client has control over depth, but with recursive fragments it's now the server's choice.
  • Roman: but what if the client doesn't care - they know it's limited.
  • Lee: this isn't the first time we've talked about this - must have been a while ago because I cannot track down the discussions.
  • Lee: it was added very deliberately, anecdotally we return trees not graphs and it was argued early on that we should call it TreeQL.
  • Introspection is a graph, but the response is always a tree.
  • Three major concerns:
    • infinite recursion: the limits couldn't live only in the specification - how and when should execution halt?
    • data validation: if the same value is returned multiple times, how is that represented in the data? Ideally you'd detect that it's cyclic and halt straight away, but some servers cannot detect this and may loop many times before noticing something went wrong and stopping
    • what's the cost of not having this, and does it warrant these problems? No, it doesn't: it's always possible to spell out the number of levels deep in your query - that's effectively the desugared version of what we'd do if we handled this in GraphQL (see the sketch after this list)
  • We'd have to be very careful that implementations don't DDoS themselves.
  • We considered adding it with a max depth, but that ended up really complex, so we decided that manually unwinding was preferable.
  • Roman: these same problems apply to all programming languages and programming systems.
  • Michael: GraphQL is not a programming language.
  • Lee: this is really important - it essentially comes down to the halting problem. We want to be able to figure out what the complexity of a query is, but if we add programming-language-esque features, you run into the halting problem and cannot know how complex a query is until you actually run it.
  • Matt: Starlark (Bazel's language) is a subset of Python that's guaranteed to halt. It's hard from a language-design perspective: you have to be very, very crisp about what a loop means and what the inputs to a loop can be. Could we possibly auto-unroll things? Yes, but the amount of crispness required is really, really high.
  • Roman: static analysis is really limited, and trying to do it is hopeless.
  • Matt: but we do rely on static analysis; GraphQL's goal is very different from an arbitrary programming language's.
  • Lee: there are always going to be limitations to static analysis; there are things we can know and things we cannot - under the hood there's Turing-complete arbitrary execution going on. It matters for a data query language, especially one that's publicly exposed: in your own programming language you're going to be carefully analyzing your code, but for an API you can't do that - you're accepting queries from other people. I don't think this is damning for fragment recursion.
  • Feedback/direction: rather than this being just a removal, what would we need to add to the spec to retain the safety we have today? Can you raise more use cases to make the proposal more convincing?
  • Roman: I'm a server-side developer, and I feel like the spec speaks too much about server-side execution, whereas it should focus on what the client wants to be delivered - not how.
  • Lee: there's a part very high up in the spec that says a server's execution need only be observably equivalent; the execution section is more to give clients a mental model for how execution might work.
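
A sketch of the trade-off using GraphiQL's introspection use case. The recursive form is hypothetical syntax from #929 and is shown in comments because it is invalid under today's validation rules:

```graphql
# Proposed (currently invalid): fragments may not spread themselves.
# fragment TypeRef on __Type {
#   kind
#   name
#   ofType { ...TypeRef }
# }

# The valid, "desugared" form: unroll to a fixed maximum depth.
fragment TypeRef on __Type {
  kind
  name
  ofType {
    kind
    name
    ofType {
      kind
      name
      # ...repeat to whatever maximum depth you assume
    }
  }
}

query {
  __schema {
    types {
      ...TypeRef
    }
  }
}
```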

Client Controlled Nullability (Alex)