Query Frontend scalability #1150

Closed
@tomwilkie

Description

We're starting to have enough sustained query load that the limit of two query frontends is hurting us. The query frontends currently have to unmarshal the JSON from the querier (and the proto from the cache), combine them, and then remarshal and send the result.

The limit of two query frontends is there so the queueing is fair. The more replicas you add, the more queues you end up with and the more it degrades to random load balancing, which is the problem the frontend was designed to solve.
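To make the fairness argument concrete, here is a toy model (not the actual frontend code). It assumes each frontend keeps its own per-tenant round-robin queue, a querier polls the frontends in turn, a "heavy" tenant's queries are spread across every frontend, and a "light" tenant's few queries all happen to land on one frontend. The names `FairQueue` and `light_tenant_share` are made up for illustration.

```python
from collections import deque


class FairQueue:
    """One frontend's queue: round-robin across tenants that have pending work."""

    def __init__(self):
        self.per_tenant = {}   # tenant -> deque of pending queries
        self.order = deque()   # tenants with pending work, in service order

    def enqueue(self, tenant, query):
        if tenant not in self.per_tenant:
            self.per_tenant[tenant] = deque()
            self.order.append(tenant)
        self.per_tenant[tenant].append(query)

    def dequeue(self):
        if not self.order:
            return None
        tenant = self.order.popleft()
        pending = self.per_tenant[tenant]
        query = pending.popleft()
        if pending:
            self.order.append(tenant)   # tenant goes to the back of the line
        else:
            del self.per_tenant[tenant]
        return tenant, query


def light_tenant_share(num_frontends, heavy=64, light=8):
    """Fraction of the first 2*light dequeues that serve the light tenant."""
    frontends = [FairQueue() for _ in range(num_frontends)]
    # Assumption: the heavy tenant's queries are spread over all frontends.
    for i in range(heavy):
        frontends[i % num_frontends].enqueue("heavy", i)
    # Assumption: the light tenant's queries all land on frontend 0.
    for i in range(light):
        frontends[0].enqueue("light", i)

    window = min(2 * light, heavy + light)
    served = []
    i = 0
    while len(served) < window:
        item = frontends[i % num_frontends].dequeue()  # querier polls in turn
        i += 1
        if item is not None:
            served.append(item[0])
    return served.count("light") / window


for n in (1, 2, 4, 8):
    print(n, light_tenant_share(n))
```

Under these assumptions a single queue gives the light tenant half of the querier's attention, and every doubling of the frontend count halves that share, because only one of the queues even knows the light tenant exists.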

One idea is to put the queue in a separate service. This service would be responsible for queueing, scheduling, retries etc. The frontend would then only be responsible for accepting the queries, consulting the cache, working out what to enqueue, accepting the response, and writing it back to the cache and the user. This would allow the frontend to be stateless and scalable. The queriers would communicate directly with the frontend service for the bulk of the response; the new queueing service would be control-path only.
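A rough in-process sketch of that split, just to show who talks to whom. Everything here is hypothetical (`Scheduler`, `Frontend`, `Querier`, `Job`, and the method names are invented for illustration, and retries are left out); real components would talk over RPC rather than direct method calls.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    """Small descriptor on the control path; it never carries a result."""
    tenant: str
    query_id: int
    query: str
    frontend: "Frontend"   # where the querier should send the response


class Scheduler:
    """The separate queueing service: one fair queue shared by all frontends."""

    def __init__(self):
        self.queues = {}       # tenant -> deque of Job
        self.order = deque()   # tenants with pending jobs, round-robin order

    def enqueue(self, job):
        if job.tenant not in self.queues:
            self.queues[job.tenant] = deque()
            self.order.append(job.tenant)
        self.queues[job.tenant].append(job)

    def next_job(self):
        if not self.order:
            return None
        tenant = self.order.popleft()
        job = self.queues[tenant].popleft()
        if self.queues[tenant]:
            self.order.append(tenant)
        else:
            del self.queues[tenant]
        return job


class Frontend:
    """Holds no queue of its own, so any number of replicas can run."""

    def __init__(self, scheduler, cache):
        self.scheduler = scheduler
        self.cache = cache     # shared cache, keyed by (tenant, query)
        self.pending = {}      # query_id -> cache key for in-flight queries
        self.results = {}      # query_id -> response for the user
        self.next_id = 0

    def submit(self, tenant, query):
        query_id, self.next_id = self.next_id, self.next_id + 1
        key = (tenant, query)
        if key in self.cache:                       # cache hit: nothing to enqueue
            self.results[query_id] = self.cache[key]
        else:
            self.pending[query_id] = key
            self.scheduler.enqueue(Job(tenant, query_id, query, self))
        return query_id

    def deliver(self, query_id, result):
        """Data path: called by the querier directly, bypassing the scheduler."""
        self.cache[self.pending.pop(query_id)] = result
        self.results[query_id] = result


class Querier:
    def __init__(self, scheduler, execute):
        self.scheduler = scheduler
        self.execute = execute   # stand-in for actually running the query

    def run_once(self):
        job = self.scheduler.next_job()
        if job is None:
            return False
        job.frontend.deliver(job.query_id, self.execute(job.query))
        return True


scheduler, cache = Scheduler(), {}
fe_a, fe_b = Frontend(scheduler, cache), Frontend(scheduler, cache)
querier = Querier(scheduler, execute=lambda q: f"result({q})")
id_a = fe_a.submit("tenant-1", "up")
id_b = fe_b.submit("tenant-2", "rate(x[5m])")
while querier.run_once():
    pass
print(fe_a.results[id_a], fe_b.results[id_b])
```

The point of the shape is that the scheduler only ever sees `Job` descriptors, so fairness is decided in one place, while the large response bodies go from querier to frontend without passing through it.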

WDYT?
