kvserver: replicate enqueue on span config update is expensive #108724

@kvoli


Describe the problem

In #100349, we began enqueuing replicas into the replicate queue upon receiving span config updates. In clusters with a large number of replicas per node, the overhead of enqueuing replicas is significant, and it recurs every 10 minutes when the PTS (protected timestamp) changes.

Expected behavior

Replicas are enqueued into the replicate queue when a span config change would cause a replication or lease change. The overhead of this enqueuing should be less noticeable on nodes with 100k+ leaseholders.
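To illustrate the expected behavior, here is a minimal sketch of gating the enqueue on whether a span config update could affect replication or leases at all. The `SpanConfig` struct, its fields, and `affectsReplication` are hypothetical simplifications for this example, not the actual kvserver types or the eventual fix.

```go
package main

import "fmt"

// SpanConfig is a simplified, hypothetical stand-in for a span config.
// It is not the real CockroachDB type.
type SpanConfig struct {
	NumReplicas         int
	NumVoters           int
	Constraints         []string
	LeasePreferences    []string
	ProtectedTimestamps []int64 // PTS records, which change frequently
}

// equalStrings reports whether two string slices have the same elements
// in the same order.
func equalStrings(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// affectsReplication reports whether an update changes any field that
// could lead to a replication or lease change. PTS-only updates are
// deliberately ignored, so they would not trigger an enqueue.
func affectsReplication(old, updated SpanConfig) bool {
	return old.NumReplicas != updated.NumReplicas ||
		old.NumVoters != updated.NumVoters ||
		!equalStrings(old.Constraints, updated.Constraints) ||
		!equalStrings(old.LeasePreferences, updated.LeasePreferences)
}

func main() {
	base := SpanConfig{NumReplicas: 3, NumVoters: 3, Constraints: []string{"+region=us-east1"}}

	// Only the PTS record changes: no enqueue needed.
	ptsOnly := base
	ptsOnly.ProtectedTimestamps = []int64{1692000000}

	// The replication factor changes: replicas should be enqueued.
	upreplicate := base
	upreplicate.NumReplicas = 5
	upreplicate.NumVoters = 5

	fmt.Println(affectsReplication(base, ptsOnly))     // false
	fmt.Println(affectsReplication(base, upreplicate)) // true
}
```

With a check of this shape, the periodic PTS update would skip the per-replica enqueue work, while real replication or lease preference changes would still reach the replicate queue.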

Additional data / screenshots

The PTS record on span configs is updated every 10 minutes, which causes a CPU spike because ShouldPlanChange is called when enqueuing replicas into the replicate queue.


Environment:
Affects master, release-23.1 and release-23.1.9-rc

Jira issue: CRDB-30613

Metadata

Labels

A-kv-distribution: Relating to rebalancing and leasing.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-kv: KV Team
branch-release-23.1: Used to mark GA and release blockers, technical advisories, and bugs for 23.1
release-blocker: Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
