validation perf improvements #2563

jbellenger · 2021-09-27T13:06:33Z

Howdy from Twitter! We're in the final stages of migrating our graphql api to graphql-java. We're very excited about the performance and stability we've measured in graphql-java as well as the healthy state of this project and it's broad adoption in the industry.

We've seen that for some of our larger queries we're spending over 5s in query validation. It looks like most of that time is caused by RulesVisitor traversing from fragment spreads into fragment definitions and that that it will revisit fragments that it has already checked. I suspect this has ~exponential complexity for worst-case fragment graphs.

We humbly propose this PR to improve RulesVisitor performance. This changes it to roughly linear complexity and shows a 5000% speedup when validating our large queries. I've included a jmh benchmark which shows a ~1800% speedup on one the existing large schema/query resources:

ValidatorBenchmark.oldRuleVisitor  avgt   30  209.684 ± 23.927  ms/op
ValidatorBenchmark.newRuleVisitor  avgt   30   38.141 ±  4.131  ms/op

This diff is slightly noisy. The crux of it is:

in checkFragmentSpread, don't traverse into fragment definitions that have already been checked
doing this requires keeping a little more state around. It helps to reuse the existing RulesVisitor rather than creating a new one for each FragmentDefinition traversal
when working with only one RulesVisitor, it helps to be able to restore state from previous scopes. This diff goes about this by wrapping some of the properties in a java.util.Stack

I'm open to literally any/all feedback here. Thanks gang.

src/test/groovy/graphql/validation/rules/NoUnusedVariablesTest.groovy

bbakerman

Thank you for this PR. The fact that no tests had to be changed and we have a benchmark run for this gives me great confidence that it will do what is promised

One interesting thing we have lesrnt about graphql is that queries are more pathological then we think.

More and more front ends are "generating" their queries via "fragment decls" and hence computers end up writing horrendously complicated queries.

And hence parsing / validation has over time become more of a contention point= than it used to be.

src/main/java/graphql/validation/RulesVisitor.java

andimarek · 2021-10-04T23:48:55Z

high level comment: I didn't have time to look really deeper: we are very happy with these kind of improvements in general. Performance is one of our main focus areas.

One thing: I don't really trust our validation tests as much as in other parts in our code base.

One specific concern: if we have fragments which are used multiple times, will we miss something?
For example: cycle detection. If we have a fragment which causes a cycle, but it is used multiple times. Could that happen? (I am not sure, hypothetical example)

jbellenger · 2021-10-07T10:25:28Z

Thanks for the feedback @andimarek!

I took the theme of your comment (validation for multi-op docs) and took a closer look at the tests for NoFragmentCycles and rules that traverse fragments spreads. I added some new tests which helped find an issue for documents with multiple operations.

The proposed behavior is now to visit a fragment definition only once per operation traversal. Are there other areas or cases I should consider here?

andimarek · 2021-10-07T21:07:13Z

That sounds good. I will give it a closer look probably soon, but it may take one or two weeks.

andimarek

Thanks a lot for this work and sorry for the long delay of my review.

We have 3 rules which want to follow fragments spreads when traversing the document:

No Undefined variables
No unused variables
Variables types match rule

As far as I can tell the only reason they need to follow fragment spreads is because they need to consider the current operation: one fragment can be reused across multiple operations in a document. So each of these rules need to check every variable usage in respect to each Operation it is used in.

But this also mean that for every Operation we only need to visit each Fragment definition once, right?

The code seems overall complex for that.
Can we simplify it by making simply sure we don't visit the same Fragment Definition multiple times per Operation? What do you think?

Edit:
Looking more into it the whole stacks we maintain are only ever needed for Fragment Definition in/out and we actually only ever have two different sets of rules: the ones following fragment spreads and the ones who don't.

I think the original code was trying to be to clever and to generic and I think we should be more explicit and less clever here. These rules are basically fixed for years and change very rarely.

src/main/java/graphql/validation/RulesVisitor.java

…isitor-optimizations

make RulesVisitor a little less abstract. No rule stacks, and slightly more explicit knowledge about the nature of rules

jbellenger · 2022-01-29T01:07:02Z

Thanks for the feedback @andimarek!

Agreed that RulesVisitor could be a bit simpler. I've reworked that class to not use a stack internally, though I found it necessary to introduce some new mutable state that was hard to avoid. Overall it feels like this attempt for simplicity may have been a wash but I'd appreciate your feedback here.

Looking more into it the whole stacks we maintain are only ever needed for Fragment Definition in/out and we actually only ever have two different sets of rules: the ones following fragment spreads and the ones who don't.

This was spot on.

andimarek · 2022-02-03T23:18:09Z

Thanks a lot @jbellenger.

.

c73bd62

bbakerman requested a review from andimarek October 1, 2021 08:07

bbakerman reviewed Oct 2, 2021

View reviewed changes

src/test/groovy/graphql/validation/rules/NoUnusedVariablesTest.groovy Show resolved Hide resolved

bbakerman approved these changes Oct 2, 2021

View reviewed changes

jbellenger commented Oct 2, 2021

View reviewed changes