Limit the number of concurrent shard snapshots #89826

Open
@tlrx

Description

Since #56911 we can create (or delete) snapshots concurrently, and we are limited to 1000 operations at a time. But we don't have any limit on the number of shards these snapshots can contain, and in a cluster with many shards this can end up with hundreds of thousands of shards waiting to be snapshotted.

I think we could introduce a limit on the maximum number of shards a cluster can snapshot at a time and reject any new snapshot creation that would cause this limit to be exceeded (without adding it to the cluster state as a new snapshot-in-progress entry).

This would also serve as a cheap back-pressure mechanism in case aggressive SLM policies are creating new snapshots faster than the cluster can snapshot the shards.
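
To make the proposal concrete, here is a minimal sketch of the admission check in Python. It is not Elasticsearch code: the class `ShardSnapshotLimiter`, the setting name `max_concurrent_shard_snapshots`, and the exception `SnapshotRejectedError` are all hypothetical names chosen for illustration, and the real change would presumably live in the snapshot creation path on the master node. The sketch only shows the idea of rejecting a request before any in-progress entry is recorded.

```python
from dataclasses import dataclass, field
from typing import Dict


class SnapshotRejectedError(Exception):
    """Hypothetical error: accepting the snapshot would exceed the shard limit."""


@dataclass
class ShardSnapshotLimiter:
    # Hypothetical setting: max shards being snapshotted cluster-wide at once.
    max_concurrent_shard_snapshots: int
    # Shard counts of accepted, still-running snapshots, keyed by snapshot name.
    in_progress: Dict[str, int] = field(default_factory=dict)

    def shards_in_progress(self) -> int:
        return sum(self.in_progress.values())

    def try_start(self, snapshot: str, shard_count: int) -> None:
        """Admit the snapshot, or reject it without recording any state."""
        if snapshot in self.in_progress:
            raise ValueError(f"snapshot {snapshot!r} is already in progress")
        projected = self.shards_in_progress() + shard_count
        if projected > self.max_concurrent_shard_snapshots:
            # Rejected up front: nothing is added to in_progress.
            raise SnapshotRejectedError(
                f"{projected} shards would exceed the limit of "
                f"{self.max_concurrent_shard_snapshots}"
            )
        self.in_progress[snapshot] = shard_count

    def complete(self, snapshot: str) -> None:
        """Release the shards of a finished (or aborted) snapshot."""
        self.in_progress.pop(snapshot, None)


# Usage: a second snapshot is rejected until the first one finishes.
limiter = ShardSnapshotLimiter(max_concurrent_shard_snapshots=1000)
limiter.try_start("nightly-1", 800)

try:
    limiter.try_start("nightly-2", 300)  # 800 + 300 > 1000
    rejected = False
except SnapshotRejectedError:
    rejected = True

limiter.complete("nightly-1")
limiter.try_start("nightly-2", 300)  # fits now that nightly-1 is done
```

The rejection in the middle is the back-pressure effect: a caller such as an SLM policy that creates snapshots too quickly gets an immediate failure instead of growing the queue of pending shards, and can succeed again once earlier snapshots complete.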
