-
Notifications
You must be signed in to change notification settings - Fork 1.1k
validation perf improvements #2563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
validation perf improvements #2563
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR. The fact that no tests had to be changed and we have a benchmark run for this gives me great confidence that it will do what is promised
One interesting thing we have lesrnt about graphql is that queries are more pathological then we think.
More and more front ends are "generating" their queries via "fragment decls" and hence computers end up writing horrendously complicated queries.
And hence parsing / validation has over time become more of a contention point= than it used to be.
high level comment: I didn't have time to look really deeper: we are very happy with these kind of improvements in general. Performance is one of our main focus areas. One thing: I don't really trust our validation tests as much as in other parts in our code base. One specific concern: if we have fragments which are used multiple times, will we miss something? |
Thanks for the feedback @andimarek! I took the theme of your comment (validation for multi-op docs) and took a closer look at the tests for NoFragmentCycles and rules that traverse fragments spreads. I added some new tests which helped find an issue for documents with multiple operations. The proposed behavior is now to visit a fragment definition only once per operation traversal. Are there other areas or cases I should consider here? |
That sounds good. I will give it a closer look probably soon, but it may take one or two weeks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this work and sorry for the long delay of my review.
We have 3 rules which want to follow fragments spreads when traversing the document:
- No Undefined variables
- No unused variables
- Variables types match rule
As far as I can tell the only reason they need to follow fragment spreads is because they need to consider the current operation: one fragment can be reused across multiple operations in a document. So each of these rules need to check every variable usage in respect to each Operation it is used in.
But this also mean that for every Operation we only need to visit each Fragment definition once, right?
The code seems overall complex for that.
Can we simplify it by making simply sure we don't visit the same Fragment Definition multiple times per Operation? What do you think?
Edit:
Looking more into it the whole stacks we maintain are only ever needed for Fragment Definition in/out and we actually only ever have two different sets of rules: the ones following fragment spreads and the ones who don't.
I think the original code was trying to be to clever and to generic and I think we should be more explicit and less clever here. These rules are basically fixed for years and change very rarely.
…isitor-optimizations
…isitor-optimizations
make RulesVisitor a little less abstract. No rule stacks, and slightly more explicit knowledge about the nature of rules
Thanks for the feedback @andimarek! Agreed that RulesVisitor could be a bit simpler. I've reworked that class to not use a stack internally, though I found it necessary to introduce some new mutable state that was hard to avoid. Overall it feels like this attempt for simplicity may have been a wash but I'd appreciate your feedback here.
This was spot on. |
Thanks a lot @jbellenger. |
Howdy from Twitter! We're in the final stages of migrating our graphql api to graphql-java. We're very excited about the performance and stability we've measured in graphql-java as well as the healthy state of this project and it's broad adoption in the industry.
We've seen that for some of our larger queries we're spending over 5s in query validation. It looks like most of that time is caused by RulesVisitor traversing from fragment spreads into fragment definitions and that that it will revisit fragments that it has already checked. I suspect this has ~exponential complexity for worst-case fragment graphs.
We humbly propose this PR to improve RulesVisitor performance. This changes it to roughly linear complexity and shows a 5000% speedup when validating our large queries. I've included a jmh benchmark which shows a ~1800% speedup on one the existing large schema/query resources:
This diff is slightly noisy. The crux of it is:
checkFragmentSpread
, don't traverse into fragment definitions that have already been checkedRulesVisitor
, it helps to be able to restore state from previous scopes. This diff goes about this by wrapping some of the properties in ajava.util.Stack
I'm open to literally any/all feedback here. Thanks gang.