Significant performance hit when using async resolvers #190
Thanks, will definitely look into that. For clarification: when you say degradation, then in comparison to what? Smaller results, sync resolvers, or earlier versions of GraphQL-core or Python?
Compared with a sync resolver. The field
Just to clarify, on #189 we didn't use async resolvers.
Did some performance tests. Based on these results I created the following monkey patch, which gives us a 2x speed improvement in our current production code:

```python
from importlib import import_module

execute = import_module(".execution.execute", "graphql")

async def serial_gather(*futures):
    # Await each awaitable sequentially instead of scheduling
    # everything through asyncio.gather.
    return [await future for future in futures]

execute.gather = serial_gather
```

NOTE: These tests are based on a CPU-bound "resolver", so any optimizations have to be checked against a more normal workload, maybe with a mix of CPU-bound and IO-bound resolvers. That said, as @JCatrielLopez indicated, there are also slowdowns in sync code, so I think there is a lot of performance left on the table. @Cito, how could we tackle investigations into increasing performance of
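The trade-off behind this monkey patch can be reproduced outside GraphQL-core. A minimal sketch (the names `trivial` and `serial_gather` here are illustrative, not library code): for coroutines that never actually suspend, awaiting them one by one avoids the per-awaitable future handling that `asyncio.gather` performs.

```python
import asyncio
import time

async def serial_gather(*futures):
    # Await each awaitable in order, skipping asyncio.gather's machinery.
    return [await future for future in futures]

async def trivial(i):
    return i  # purely CPU-bound "resolver": nothing to await

async def main():
    n = 10_000
    t0 = time.perf_counter()
    await asyncio.gather(*(trivial(i) for i in range(n)))
    t_gather = time.perf_counter() - t0

    t0 = time.perf_counter()
    await serial_gather(*(trivial(i) for i in range(n)))
    t_serial = time.perf_counter() - t0
    return t_gather, t_serial

t_gather, t_serial = asyncio.run(main())
print(f"gather: {t_gather:.4f}s  serial: {t_serial:.4f}s")
```

On a trivial workload like this, the serial version typically comes out ahead; the next comment explains why that reverses as soon as the coroutines really suspend.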
@kevinvalk that's a really good find, and it would definitely help the cases with simple async resolvers. However, it's worth pointing out that as soon as your resolvers are actually doing anything that might take some time (e.g. HTTP requests or file access), using the serial gather method will slow things down, because the functions aren't running in parallel. I've written a benchmark to show that (the async function sleeps for 50 milliseconds on every 1000th iteration): https://perfpy.com/242 There might be an argument for allowing developers to configure how async results are gathered, but why are you using async functions for purely synchronous work anyway, @kevinvalk? Async functions will always have an overhead compared to sync functions.
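This point can be illustrated with a standalone sketch (assuming nothing about graphql-core's internals): when the coroutines really do suspend, awaiting them serially adds the latencies up, while `asyncio.gather` overlaps them.

```python
import asyncio
import time

async def io_resolver():
    await asyncio.sleep(0.02)  # stand-in for an HTTP request or DB call
    return 1

async def serial_gather(*futures):
    return [await future for future in futures]

async def main():
    t0 = time.perf_counter()
    await asyncio.gather(*(io_resolver() for _ in range(5)))
    parallel = time.perf_counter() - t0  # sleeps overlap: ~0.02s total

    t0 = time.perf_counter()
    await serial_gather(*(io_resolver() for _ in range(5)))
    serial = time.perf_counter() - t0  # sleeps run back to back: ~0.1s total
    return parallel, serial

parallel, serial = asyncio.run(main())
print(f"parallel: {parallel:.3f}s  serial: {serial:.3f}s")
```

A bare coroutine only starts executing when it is awaited, so the serial version cannot overlap the waits no matter how the event loop schedules things.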
Thank you all for your input, @kevinvalk. However, I think we all agree that there is room for improvement in supporting different strategies, maybe in the form of optimization hints for the execution algorithm. At least, it should be possible to specify whether a serial or parallel execution strategy is used. Currently, only mutations are resolved serially, and you have no way to force queries to be executed serially (just like in GraphQL.js). Note that GraphQL-core already detects when all (sub)field resolvers are synchronous and does not try to await them.
@Cito One optimisation that we could make is to never use `asyncio.gather` when there is only one coroutine, since it wouldn't provide any benefit. It would help the example in this issue a lot, but I'm not sure how helpful it would be for most real use cases. What do you think?
@jkimbo Yes, that's a simple thing I can implement. The additional check is probably cheap compared to the overhead of running `asyncio.gather`.
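A sketch of the single-awaitable shortcut being discussed (the helper name `gather_results` is hypothetical, not graphql-core's actual code):

```python
import asyncio

async def gather_results(*awaitables):
    # Hypothetical helper: bypass asyncio.gather entirely when there is
    # only one awaitable, since there is nothing to run concurrently.
    if len(awaitables) == 1:
        return [await awaitables[0]]
    return list(await asyncio.gather(*awaitables))

async def resolve(value):
    return value

# A single awaitable takes the cheap path; multiple still use gather.
print(asyncio.run(gather_results(resolve(1))))              # [1]
print(asyncio.run(gather_results(resolve(1), resolve(2))))  # [1, 2]
```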
See discussion in issue #190. Also, run coverage in Python 3.10, since running it in Python 3.11 reports false negatives in test_lists (probably a bug in coverage).
Hey there, I've noticed something related to this when using GraphQL middleware with async execution:
@Cito I monkey patched your change into our product and (as you also expected) it does not do anything :'( The synthetic nature of the problem is not really so synthetic, as we are using https://github.com/syrusakbary/aiodataloader for the GraphQL N+1 problem (and much more efficient database queries). This means that only a few times will we have an actual database lookup (IO-bound), and all other times we have direct cache lookups (CPU-bound) that just take the objects from the cache. However, all resolvers have to be async to get this to work. So I was wondering if maybe my approach is wrong or if I am missing something else. Any ideas are more than welcome!

EDIT: After posting this I patched aiodataloader in the same way. Note that I still need the serial await in GraphQL-core as well; if I undo the monkey patch I go from 500ms -> 2s per request.

```python
from importlib import import_module
from typing import Any, Awaitable

execute = import_module(".execution.execute", "graphql")
aiodataloader = import_module("aiodataloader")

async def serial_gather(*futures: Awaitable[Any]):
    # Await sequentially instead of going through asyncio.gather.
    return [await future for future in futures]

aiodataloader.gather = execute.gather = serial_gather  # type: ignore
```
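For context on why every resolver ends up async here: the dataloader pattern collects the individual `load()` calls made during one tick of the event loop and services them with a single batched fetch. A toy sketch of the idea (not aiodataloader's actual implementation):

```python
import asyncio

class MiniLoader:
    """Toy sketch of the dataloader pattern: load() calls made in the
    same event-loop tick are served by one batched fetch, which is how
    the N+1 query problem is avoided."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.queue = []

    def load(self, key):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        if not self.queue:
            # First load in this tick: schedule one dispatch for the batch.
            loop.call_soon(self._dispatch)
        self.queue.append((key, fut))
        return fut

    def _dispatch(self):
        batch, self.queue = self.queue, []

        async def run():
            values = await self.batch_fn([key for key, _ in batch])
            for (_, fut), value in zip(batch, values):
                fut.set_result(value)

        asyncio.ensure_future(run())

async def batch_get(keys):
    # Stand-in for a single batched database query.
    return [key * 10 for key in keys]

async def main():
    loader = MiniLoader(batch_get)
    # Three "resolvers" load individually, but batch_get runs only once.
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(3))

print(asyncio.run(main()))  # [10, 20, 30]
```

Because `load()` hands back a future, every field resolver that touches the loader has to be async, even when the value is already sitting in the cache.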
@kevinvalk thanks for the feedback. In fact, if you're using a lot of caching and the work starts to become CPU-bound, then the price for async can be heavy and diminish the advantage of caching. See also "Wall vs CPU time, or the cost of asyncio Tasks" and "CPU Bound code runs 3x-5x slower with asyncio". I currently do not have a good idea to solve or mitigate this problem, and honestly speaking I don't want to spend too much time on optimizations. The main goal of GraphQL-core is to be a faithful Python port of the reference implementation GraphQL.js and stay up to date with its latest version in functionality, not to be particularly fast. If performance is needed, then there is only so much you can do with pure Python. Btw, there is also the open issue #166. I wonder if using anyio with trio would help, but I fear it may become even slower through the use of anyio as another layer.
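The per-call overhead being referred to is easy to measure in isolation. A rough sketch (exact ratios vary by interpreter version and machine):

```python
import asyncio
import time

def sync_resolver(i):
    return i * 2

async def async_resolver(i):
    return i * 2  # identical work, but every call allocates a coroutine

N = 100_000

t0 = time.perf_counter()
sync_results = [sync_resolver(i) for i in range(N)]
t_sync = time.perf_counter() - t0

async def run_all():
    # Serial awaits, so this measures pure async call overhead,
    # not scheduling strategy.
    return [await async_resolver(i) for i in range(N)]

t0 = time.perf_counter()
async_results = asyncio.run(run_all())
t_async = time.perf_counter() - t0

print(f"sync: {t_sync:.3f}s  async: {t_async:.3f}s  ratio: {t_async / t_sync:.1f}x")
```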
We had exactly the same problem and found out that most of the time is spent in garbage collection (`gc_collect_main`). If I disable garbage collection, the program runs in half the time:

```python
if __name__ == "__main__":
    import gc

    gc.disable()
    run(main, is_profile=False)
```

As @Cito already mentioned, the price for async can be heavy when the work is CPU-bound. Perhaps this information helps to find some solution to this problem.
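Disabling the collector entirely lets reference cycles accumulate for the lifetime of the process. A more cautious sketch, using only stdlib `gc` APIs (the threshold values here are arbitrary examples, not tuned recommendations):

```python
import gc

# Collect whatever garbage startup produced, then freeze the survivors
# into the permanent generation so the cyclic collector skips them.
gc.collect()
gc.freeze()

# Raise the generation-0 threshold so collections run far less often
# (the default is (700, 10, 10)).
gc.set_threshold(50_000, 50, 50)

print(gc.get_threshold())  # (50000, 50, 50)
```

This keeps the collector available for genuine cycles while removing most of the pauses caused by scanning long-lived objects on every collection.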
Thanks for the valuable input, @QSHolzner!
During development (Starlette + Ariadne) of our product I noticed significant performance degradation when GraphQL responses got significantly long (1000+ entities in a list). I started profiling and drilling into the issue, and I pinpointed it to async resolvers. Whenever a resolver is async and it is called a lot (100,000 times), you can see significant slowdowns of 4x-7x, even if there is nothing async about it. The question I ended up with: is this a limitation of Python asyncio, or of how the results are gathered for async fields in graphql's execute?
Any insight/help is greatly appreciated as we really need more performance and changing to sync is not really an option (nor is rewriting it to another language) 😭
Versions: