Why is asyncpg doing type introspection on json types? #1206

Closed
@swanysimon

Description

  • asyncpg version: 0.30.0
  • PostgreSQL version: 15.3
  • Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: RDS, and yes
  • Python version: 3.12.6
  • Platform: MacOS and Linux
  • Do you use pgbouncer?: No
  • Did you install asyncpg with pip?: No, poetry
  • If you built asyncpg locally, which version of Cython did you use?: n/a
  • Can the issue be reproduced under both asyncio and uvloop?: Have not tried. Happy to if you think it would be beneficial

Spinning out of #1138 (comment) because it feels like a different discussion.


I'm running a FastAPI service that connects to AWS RDS and needs to refresh credentials every 15 minutes. Normally the type introspection queries don't take up much time because they run once per connection, but I have a lot of churn in my connection pool, so they end up running a decent number of times. Recently I've seen more traffic and thus more connections being created, and the more connections we create, the more likely we are to see slow queries on things that are normally fast.

At a very high level, my service is set to connect to the database with:

import boto3
from sqlalchemy import event
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    postgres_url(use_asyncpg=True),
    pool_size=10,
    max_overflow=25,
    pool_recycle=600,  # IAM credentials expire after 15 mins
    pool_pre_ping=True,
)

@event.listens_for(engine.sync_engine, "do_connect")
def provide_token(dialect, conn_rec, cargs, cparams) -> None:
    # Mint a fresh IAM auth token as the password for each new connection.
    cparams["password"] = boto3.client("rds").generate_db_auth_token(
        config.POSTGRES_HOST, config.POSTGRES_PORT, config.POSTGRES_USER,
    )

Even abnormally slow type introspection queries aren't horrible, but they are noticeable: in the example below, these two queries took more than 50% of the service's total response time.

[Screenshot 2024-10-31 at 17 57 49: trace showing the two introspection queries accounting for more than half of total response time]

Debugging locally with command: ["postgres", "-c", "log_statement=all"] in my docker-compose.yml, I can see which types asyncpg needs to examine:

2024-11-01 20:52:52.239 UTC [491] LOG:  execute __asyncpg_stmt_1__: SELECT
            t.oid,
            t.typelem     AS elemtype,
            t.typtype     AS kind
        FROM
            pg_catalog.pg_type AS t
        WHERE
            t.oid = $1

2024-11-01 20:52:52.239 UTC [491] DETAIL:  parameters: $1 = '114'
2024-11-01 20:52:52.240 UTC [491] LOG:  execute __asyncpg_stmt_2__: SELECT
            t.oid,
            t.typelem     AS elemtype,
            t.typtype     AS kind
        FROM
            pg_catalog.pg_type AS t
        WHERE
            t.oid = $1

2024-11-01 20:52:52.240 UTC [491] DETAIL:  parameters: $1 = '3802'

These correspond to the JSON and JSONB types, respectively, not even custom types.
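
For anyone who wants to reproduce the statement logging above, here is a minimal docker-compose sketch (the image tag, service name, and password are assumptions on my part; only the command line comes from my actual setup):

```yaml
services:
  db:
    image: postgres:15        # matches the 15.x server in this report
    environment:
      POSTGRES_PASSWORD: postgres   # placeholder credential
    command: ["postgres", "-c", "log_statement=all"]
    ports:
      - "5432:5432"
```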


The actual question: how can I pre-register the JSON and JSONB types on each connection so I don't have to keep running the introspection query? I've tried the json_serializer/json_deserializer arguments to the SQLAlchemy engine, and I've also tried hooking into SQLAlchemy events to intercept connection creation and set the codecs there.
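
Concretely, the event-based codec registration I tried looks roughly like this. This is a sketch, not a working fix: `attach_codec_listener` and `_register_json_codecs` are names I'm using for illustration, and `set_type_codec()` may itself introspect the type once per connection to resolve its OID, which is exactly the query I'm trying to avoid.

```python
import json

from sqlalchemy import event

JSON_TYPES = ("json", "jsonb")

async def _register_json_codecs(asyncpg_conn):
    # set_type_codec() is asyncpg's public API for overriding how a
    # type is encoded/decoded; schema="pg_catalog" targets the
    # built-in json/jsonb types rather than user-defined ones.
    for typename in JSON_TYPES:
        await asyncpg_conn.set_type_codec(
            typename,
            encoder=json.dumps,
            decoder=json.loads,
            schema="pg_catalog",
        )

def attach_codec_listener(engine):
    # "connect" fires once per new DBAPI-level connection, i.e. once
    # per pooled connection rather than once per checkout.
    @event.listens_for(engine.sync_engine, "connect")
    def _on_connect(dbapi_connection, connection_record):
        # run_async() bridges from this sync hook into the underlying
        # asyncpg Connection wrapped by SQLAlchemy's adapter.
        dbapi_connection.run_async(_register_json_codecs)
```

The intent is to call `attach_codec_listener(engine)` once at startup, right after `create_async_engine()`.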
