Performance issues related to complete_value on large datasets #189

Open
JCatrielLopez opened this issue Jan 11, 2023 · 3 comments


JCatrielLopez commented Jan 11, 2023

Hi! We've noticed that returning a list of 5k elements, each with a couple of nested objects, is pretty slow:

Person {
  id
  name
  lastname
  age
  address {street number}
  job {id org_name}
  partner {id name}
  pets {name type}
  school {id name}
}


ncalls      tottime   percall    cumtime  percall    filename:lineno(function)
1           1.8e-05   1.8e-05    2.145    2.145      graphql.py:103(graphql_sync)
1           1.5e-05   1.5e-05    2.145    2.145      graphql.py:152(graphql_impl)
1/30001     0.19      6.335e-06  2.137    7.122e-05  execute.py:413(ExecutionContext.execute_fields)
1           1.3e-05   1.3e-05    2.137    2.137      execute.py:965(execute)
1           7e-06     7e-06      2.137    2.137      execute.py:328(ExecutionContext.execute_operation)
1/135001    0.3229    2.392e-06  2.135    1.581e-05  execute.py:485(ExecutionContext.execute_field)
1/145001    0.2737    1.888e-06  2.071    1.428e-05  execute.py:575(ExecutionContext.complete_value)
1           0.009884  0.009884   2.071    2.071      execute.py:660(ExecutionContext.complete_list_value)
5000/30000  0.02747   9.156e-07  2.026    6.752e-05  execute.py:893(ExecutionContext.complete_object_value)

By itself it's not really a slow function, but it's executed 30k times. Is there any way to reduce the overhead by reducing the number of times this function is invoked?

Tested on Python 3.8 and graphql-core==3.2.3
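
For reference, the 30k figure in the profile is consistent with one object completion per person plus one per nested object field. A minimal sketch of that arithmetic, assuming every nested object field (address, job, partner, pets, school) resolves to a non-null object for each of the 5,000 persons; the variable names are illustrative only:

```python
# Rough call-count estimate for the query above.
# Assumption: each of the 5 nested object fields resolves to an object
# for every person, so each person costs 1 + 5 object completions.
PERSONS = 5_000
NESTED_OBJECT_FIELDS = 5  # address, job, partner, pets, school

objects_per_person = 1 + NESTED_OBJECT_FIELDS
object_completions = PERSONS * objects_per_person

print(object_completions)  # 30000
```

This matches the 30000 total calls reported for complete_object_value; execute_fields shows one more (30001), which would be the root query object.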

@JCatrielLopez JCatrielLopez changed the title Performance issues with complete_value on large datasets Performance issues related to complete_value on large datasets Jan 11, 2023
@JCatrielLopez
Author

Possibly related to this graphql-js issue

Member

Cito commented Jan 11, 2023

Thanks for reporting. Will look into this when I have more time, probably only after releasing 3.3. It would be helpful if you could post example code with dummy data to reproduce this.

Author

JCatrielLopez commented Jan 12, 2023

schema.graphql:

type Query {
    persons: [Person]
}

type Person {
    id: String!
    name: String
    ssn: String
    alive: Boolean
    has_job: Boolean
    job: JobDetails
    address: Address
    pets: Pets
    house: House
    partner: Person
}

type JobDetails {
    id: String
    name: String
}

type Address {
    id: String
    name: String
}

type Pets {
    id: String
    name: String
    race: String
    color: String
}

type House {
    color: String
    floors: Int
    is_duplex: Boolean
    is_apt: Boolean
}

server.py:

import random
import string
import sys

import yappi

from graphql import graphql_sync, build_ast_schema
from graphql.language.parser import parse

yappi.set_clock_type("wall")

with open("./schema.graphql", "r") as f:
    schema = build_ast_schema(parse(f.read()))


class Query:
    """The root resolvers"""

    def persons(self, info):
        output = []
        for _ in range(5_000):
            output.append(
                dict(
                    id="".join(random.choices(string.ascii_lowercase + string.digits, k=9)),
                    name="John Doe",
                    ssn="00000000000000000",
                    alive=True,
                    has_job=False,
                    job=dict(id="xxx", name="test"),
                    address=dict(id="yyy", name="Fake Street"),
                    pets=dict(id="zzz", name="test"),
                    house=dict(
                        color="RED",
                        floors=2,
                        is_duplex=False
                    ),
                    partner=dict(id="".join(random.choices(string.ascii_lowercase + string.digits, k=9)), name="test"),
                )
            )
        return output


def main():
    query = """{ 
        persons{ 
            id 
            name 
            alive 
            has_job 
            job{id name}
            partner{id name}
            address{id name}
            pets{id name}
            house{color floors is_duplex}
        } 
    }"""

    yappi.start()
    result = graphql_sync(schema, query, Query())
    yappi.stop()

    if result.errors:
        print(result)
        sys.exit(1)

    yappi.get_func_stats().save("profile", type="pstat")


# To visualize profile:
# python -m snakeviz profile --server

if __name__ == '__main__':
    main()
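
As an alternative to snakeviz, the saved file can also be inspected in the terminal. The sketch below assumes the file written by server.py is in pstat format (as requested via type="pstat") and can therefore be loaded by the standard-library pstats module; the helper name print_top_functions is hypothetical, not part of yappi or graphql-core:

```python
import os
import pstats


def print_top_functions(profile_path: str, limit: int = 10) -> pstats.Stats:
    """Load a pstat-format profile and print the top entries by cumulative time."""
    stats = pstats.Stats(profile_path)
    # Sort by cumulative time so the outermost hot paths come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return stats


# "profile" is the file name used by server.py above; skip if it is absent.
if os.path.exists("profile"):
    print_top_functions("profile")
```

Sorting by cumulative time should surface the same execute_fields / execute_field / complete_value chain shown in the table in the first comment.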
