
GraphQLSchema serialization using pickle #173

Closed
rpgreen opened this issue Jun 2, 2022 · 24 comments · Fixed by #184

rpgreen commented Jun 2, 2022

Is it possible to serialize an instance of GraphQLSchema?

We would like to dump and load an instance of GraphQLSchema to/from a file for performance reasons.

It does not appear to be serializable using pickle by default.

>       res = pickle.dumps(schema)
E       AttributeError: Can't pickle local object 'extend_schema_impl.<locals>.build_object_type.<locals>.<lambda>'

Thanks in advance.


Cito commented Jun 2, 2022

It's currently not supported. You can serialize a schema to SDL using print_schema. But you can't pickle a schema because of the resolvers functions and also because GraphQL-core uses lambdas a lot, and these can't be pickled.

Will keep this open as a reminder to maybe/someday investigate whether supporting this would be feasible somehow without totally reinventing GraphQL-Core. If anybody wants to work on this or has comments or ideas how to solve this, please feel free to jump in.
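For reference, the SDL round-trip mentioned above looks roughly like this (a minimal sketch; the SDL preserves the type definitions but not any attached Python resolvers):

from graphql import build_schema, print_schema

schema = build_schema("""
type Query {
  hello: String
}
""")
sdl = print_schema(schema)    # supported today: schema -> SDL text
restored = build_schema(sdl)  # rebuild later; resolvers must be re-attached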

@Cito Cito changed the title GraphQLSchema serialization GraphQLSchema serialization using pickle Jun 2, 2022
@Cito Cito added help wanted Extra attention is needed discussion Needs more discussion investigate Needs investigation or experimentation labels Jun 2, 2022
@erikwrede

@rpgreen Out of curiosity, where do you expect performance gains when serializing your schema?


rpgreen commented Jun 2, 2022

We are using build_schema() at runtime and there's a significant performance hit (several hundred ms up to several seconds).

I assume it would be much faster to rehydrate a serialized schema object if that was possible.


Cito commented Jun 2, 2022

@rpgreen Interesting. Maybe it would be worthwhile then to profile why/where build_schema() is so slow and whether there is potential for optimization. The first step would be to find out whether it's the parse() step or the build_ast_schema() step of build_schema() that is the main problem.
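A quick way to time the two steps separately (a rough sketch; schema.graphql is a placeholder for your SDL file):

import time
from graphql import build_ast_schema, parse

with open("schema.graphql") as f:
    sdl = f.read()
start = time.perf_counter()
document = parse(sdl)                # step 1: SDL text -> AST
parsed_at = time.perf_counter()
schema = build_ast_schema(document)  # step 2: AST -> GraphQLSchema
built_at = time.perf_counter()
print(f"parse: {parsed_at - start:.3f}s, build_ast_schema: {built_at - parsed_at:.3f}s")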


erikwrede commented Jun 2, 2022

I did a cProfile profiling run on the current GitLab GQL schema. You can easily download it using Altair, selecting https://gitlab.com/api/graphql and running Export SDL in the docs section.

Here are the insights:
The schema file has 51k lines, including blanks and lots of documentation.

Building the schema took a total of 3.8 seconds using cProfile on Python 3.10, of which 2.67s were spent on build_ast_schema and 1.16s on parse. In total, 52k nodes were parsed.
Most of the building time (2.4s) is spent in visit, with the nested calls taking 2s and the function itself taking 430ms:

[screenshot: profiler call graph showing the time spent in visit]

These are the top bottlenecks (see Own Time):
[screenshot: profiler statistics sorted by own time]
get_visit_fn is by far the largest bottleneck by accumulated time; however, it is also called quite frequently. __init__ is the init function of Node, where the **kwargs are traversed and transformed into FrozenList.
Inside parse I couldn't find similar bottlenecks, but I'm not familiar with the code of that part of the library (yet).
Maybe (pure speculation) there is a way to reduce the number of get_visit_fn calls (1.3 million for 52k nodes, though there are multiple types of visit functions per node), but that would probably require significantly rewriting the entire build_schema part.

Maybe this helps someone more familiar with the internals of core to identify potential issues. Most of the long durations here simply seem to be caused by a large number of nodes in the schema, not significant problems with the parsing/validation logic.

Keep in mind that this analysis is based on a quick cProfile iteration; do not take it as scientific evidence 😉


Cito commented Jun 4, 2022

Thanks @erikwrede for the profiling. Did you use the latest version (3.2.1) of GraphQL-core? I'm asking because there have been some speedups and get_visit_fn is actually deprecated.


Cito commented Jun 5, 2022

@rpgreen I have looked into this a bit more. It seems you want to pickle only schemas that have been produced from SDL, which makes this issue a bit more restricted and easier to tackle.

The reason why this currently does not work is the implementation in extend_schema.py, which heavily relies on lambdas and local functions, which cannot be pickled. Refactoring this is probably not easy, and it would also deviate from what GraphQL.js is doing, which likewise uses anonymous and local functions. If anybody has ideas how to solve this, please comment.

Also, I'm still not sure how much faster pickling really would be, and whether improving the performance of build_schema() would not be more worthwhile.


Cito commented Jun 5, 2022

Btw, I have added a test for pickling schemas here which is currently skipped.


erikwrede commented Jun 5, 2022

Just checked: it seems I messed up the version in my test env. The new version sees a performance improvement roughly equal to removing get_visit_fn from the statistics; parsing now takes 55% of the time instead of a third of the total run time. I might look into this further soon. Maybe there is some improvement potential in the DFS of visit; 400 ms own time for 50k nodes seems like a lot for a DFS (even with a profiler slowing things down).


Cito commented Jun 6, 2022

@rpgreen @erikwrede I have now created a branch pickle-schema that allows serialization of a schema built from SDL. It may be a bit slower when building schemas, as it uses a class instead of local functions and therefore needs attribute lookups, but it does not seem to be a big difference. Please check it out and give me feedback whether you think this should go to the main branch.
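The Python constraint behind this change, as a generic sketch (not graphql-core's actual code): pickle stores functions by qualified name, so instances of a module-level class can be pickled, while lambdas and local functions cannot:

import pickle

class FieldsThunk:
    # Module-level class: instances pickle as a class reference plus state.
    def __init__(self, type_name):
        self.type_name = type_name
    def __call__(self):
        return self.type_name

def make_local_thunk(type_name):
    return lambda: type_name  # local lambda: not importable by name

pickle.dumps(FieldsThunk("Query"))  # works
try:
    pickle.dumps(make_local_thunk("Query"))
except (pickle.PicklingError, AttributeError) as exc:
    print(exc)  # Can't pickle local object 'make_local_thunk.<locals>.<lambda>'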


rpgreen commented Jun 6, 2022

@Cito @erikwrede Thanks for looking into this. We will try this out and share our results.

From our perspective, we would be fine with a higher one-time cost of building the schema as long as we have a way to quickly serialize/deserialize at runtime.
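The intended usage pattern would look roughly like this (a sketch that requires the pickle-schema branch; the cache file name and SDL source are placeholders):

import pickle
from pathlib import Path
from graphql import build_schema

CACHE = Path("schema.pickle")

def load_schema(sdl: str):
    # Reuse the pickled schema if present; otherwise pay the one-time
    # build_schema() cost and cache the result for later runs.
    if CACHE.exists():
        return pickle.loads(CACHE.read_bytes())
    schema = build_schema(sdl)
    CACHE.write_bytes(pickle.dumps(schema))
    return schema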

@rpgreen-vendia

Hi folks,
We've done some prototyping with this and we are seeing an improvement. Can we get this merged into the main branch?


Cito commented Jul 25, 2022

@rpgreen-vendia thanks for the feedback. I will consider this when working on the next version. Please have some patience, as I cannot work on this full time and am currently deep in other projects. Feedback from others is also appreciated.

mika-vendia commented Aug 18, 2022

Hi there,
Thank you for working on this feature. I made a POC with the pickle-schema branch and I'm seeing major performance improvements with this change, but there are also some issues that have come up.

  1. I see that graphql-core has its own implementation of deepcopy that was not updated as part of the pickle-schema branch. Currently, the error I get when I try to deepcopy a previously pickled schema is:
>               raise TypeError(
                    "Schema must contain uniquely named types"
                    f" but contains multiple types named '{type_name}'."
                )
E               TypeError: Schema must contain uniquely named types but contains multiple types named 'String'.

../.venv/lib/python3.9/site-packages/graphql/type/schema.py:258: TypeError
  2. When running the introspection query, the introspection result omits the types field for a previously pickled schema. Sample code below:
import json
import pickle
from pathlib import Path

from graphql import build_schema, introspection_from_schema

compiled_gql = build_schema(gql_schema)  # gql_schema is the SDL string
pickled = pickle.dumps(compiled_gql)
unpickled = pickle.loads(pickled)
result = introspection_from_schema(unpickled)
resource_path_2 = Path("/Users/mmeyers/some_directory")
with open(resource_path_2 / "introspection.json", "w") as to_file:
    to_file.write(json.dumps(result))

Please let me know if I can give more detail on any of this and/or if I'm doing any part of this wrong. Thanks again!

Mika


Cito commented Aug 18, 2022

Thanks for the feedback, @mika-vendia. Will look into this.

@Cito Cito added the wip Work in progress label Sep 22, 2022
@Cito Cito added this to the v3.3 milestone Sep 22, 2022

Cito commented Sep 25, 2022

I have started working on this issue again. First step was to update the pickle-schema branch by merging in the latest changes from main, because I plan to eventually release this as v3.3.

@mika-vendia I have also looked into your point 1 (deepcopy of a pickled schema). I was able to create a unit test reproducing the problem. The cause is clear: the unpickled standard scalar type objects are different from the original ones. I have fixed this now by replacing them with the original ones when creating the schema. The unit test passes. Maybe you can check whether point 1 is also solved for you with the current pickle-schema branch.

Will continue to work on this later this week. Maybe there is a better way of solving this, more fundamentally by somehow guaranteeing that singletons are preserved in the pickle/unpickle process. Will also look into your point 2.
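The general Python mechanism for preserving singletons across pickle/unpickle, as a generic sketch (not graphql-core's actual implementation, and GraphQLString here is a locally defined stand-in): have __reduce__ return the name of the module-level object, so unpickling resolves to the existing instance instead of constructing a new one:

import pickle

class ScalarSingleton:
    def __init__(self, name):
        self.name = name
    def __reduce__(self):
        # Returning a string tells pickle to load the module-level
        # global with this name instead of rebuilding the object.
        return self.name

GraphQLString = ScalarSingleton("GraphQLString")
restored = pickle.loads(pickle.dumps(GraphQLString))
assert restored is GraphQLString  # identity is preserved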

@mika-vendia

Thank you, @Cito! I will test out issue 1 and let you know how it goes.

@danvendia

@Cito - @mika-vendia has confirmed that the fix for the first issue is working as expected. Let us know if you need any more details on the second issue.


Cito commented Oct 23, 2022

@danvendia I have now solved the issue a bit more fundamentally in this branch and extended the tests. Should still be working. Will try to tackle the second issue next week.


Cito commented Nov 1, 2022

@danvendia Update: I have looked into the second issue, can reproduce it and have an idea how to solve it, even more fundamentally than the last fix. Will continue to work on it this week.

@danvendia

@Cito - awesome! Thanks for the update!

@Cito Cito removed the investigate Needs investigation or experimentation label Nov 2, 2022
@Cito Cito added enhancement Enhancement request and removed wip Work in progress help wanted Extra attention is needed discussion Needs more discussion labels Nov 2, 2022
@Cito Cito self-assigned this Nov 2, 2022
@Cito Cito linked a pull request Nov 2, 2022 that will close this issue
Cito added a commit that referenced this issue Nov 2, 2022
@Cito Cito closed this as completed in #184 Nov 2, 2022

Cito commented Nov 2, 2022

@danvendia The complete fix has now been merged into the main branch and will be available with the next alpha release. Please open a new issue if there are still any problems with this.


Cito commented Nov 3, 2022

@danvendia @mika-vendia @rpgreen-vendia: This is now available in v3.3.0a2 for testing.

Ambro17 commented Nov 4, 2022

Just came by to say you rock @Cito 🎸
