DataShard: read actors retry TEvRead to shards long after they finished splitting #11036

@snaury

Looks like DataShard currently replies with a very generic error:

    if (!IsStateNewReadAllowed()) {
        replyWithError(
            Ydb::StatusIds::OVERLOADED,
            TStringBuilder() << "Shard " << TabletID() << " is splitting/merging"
                << " (node# " << SelfId().NodeId() << " state# " << DatashardStateName(State) << ")");
        return;
    }

This causes ReadActor (LookupActor?) to keep retrying the same shard with exponential backoff without resolving the new partitioning. I've seen this cause 3+ seconds of latency (due to the exponential backoff) until it finally requests the new shard.
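To illustrate why a generic OVERLOADED reply is costly here, below is a minimal standalone sketch, not the actual ReadActor code. All names (`EReadStatus`, `ERetryAction`, `DecideRetry`, `BackoffDelay`, `TotalBackoff`) are hypothetical, and the 100 ms base delay is an assumed value picked only for illustration. It shows how a handful of doubling retries adds up to seconds, and how a more specific status could let the reader re-resolve partitioning instead of waiting.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using TMs = std::chrono::milliseconds;

// Hypothetical statuses; these are NOT the real Ydb::StatusIds values.
enum class EReadStatus { Success, Overloaded, ShardSplit };

// Hypothetical actions a read actor could take after a reply.
enum class ERetryAction { None, BackoffSameShard, ResolvePartitioning };

// With only a generic Overloaded status the reader cannot tell that the
// shard is gone, so it keeps backing off against the same tablet.
// A dedicated status (assumed here) would allow an immediate re-resolve.
ERetryAction DecideRetry(EReadStatus status) {
    switch (status) {
        case EReadStatus::Success:    return ERetryAction::None;
        case EReadStatus::Overloaded: return ERetryAction::BackoffSameShard;
        case EReadStatus::ShardSplit: return ERetryAction::ResolvePartitioning;
    }
    return ERetryAction::None;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
TMs BackoffDelay(int attempt, TMs base, TMs cap) {
    assert(attempt >= 0);
    TMs delay = base;
    for (int i = 0; i < attempt && delay < cap; ++i) {
        delay *= 2;
    }
    return std::min(delay, cap);
}

// Total time spent waiting across `attempts` consecutive retries.
TMs TotalBackoff(int attempts, TMs base, TMs cap) {
    TMs total{0};
    for (int i = 0; i < attempts; ++i) {
        total += BackoffDelay(i, base, cap);
    }
    return total;
}
```

Under these assumed parameters, five retries wait 100 + 200 + 400 + 800 + 1600 ms, i.e. `TotalBackoff(5, TMs(100), TMs(10000))` is 3100 ms, which is in the same range as the latency described above. The real backoff settings may differ.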
