@snaury snaury commented Oct 29, 2024

Changelog entry

Fixed excessive read latency during and after some shard splits.

Changelog category

  • Bugfix

Additional information

Reads were observed to sometimes take seconds during frequent shard splits. It turned out that shards kept replying with an OVERLOADED status even after the split had already finished. This caused KQP to retry reads repeatedly with exponential backoff until, after multiple seconds, a guard condition finally made the read actor re-resolve the table. Replying with the correct NOT_FOUND status (which indicates the table no longer exists) fixes this problem.

Fixes #11036.
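
To illustrate why the status code matters, here is a minimal simulation of the retry behavior described above. It is a sketch, not the KQP read actor: `simulate_read`, the `Status` enum, and the values of `base_delay`, `max_delay`, and `guard_timeout` are all hypothetical and chosen only to make the effect visible.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "SUCCESS"
    OVERLOADED = "OVERLOADED"
    NOT_FOUND = "NOT_FOUND"


def simulate_read(stale_shard_status, base_delay=0.01, max_delay=1.0,
                  guard_timeout=3.0):
    """Return (simulated seconds spent waiting, attempts) for one read.

    The read first targets a shard that has already been split away and
    answers with `stale_shard_status`. After a re-resolve the read goes to
    a live shard and succeeds. All delays are hypothetical and simulated;
    nothing actually sleeps.
    """
    waited = 0.0
    attempts = 0
    delay = base_delay
    resolved = False  # becomes True once the table has been re-resolved

    while True:
        attempts += 1
        status = Status.SUCCESS if resolved else stale_shard_status

        if status is Status.SUCCESS:
            return waited, attempts

        if status is Status.NOT_FOUND:
            # The shard is gone: re-resolve immediately, no backoff.
            resolved = True
            continue

        # OVERLOADED looks transient: back off and retry the same shard.
        waited += delay
        delay = min(delay * 2, max_delay)
        if waited >= guard_timeout:
            # Assumed guard condition: give up on the stale shard and
            # re-resolve only after a long total wait.
            resolved = True


if __name__ == "__main__":
    for status in (Status.OVERLOADED, Status.NOT_FOUND):
        waited, attempts = simulate_read(status)
        print(f"{status.value}: waited {waited:.2f}s over {attempts} attempts")
```

With these assumed parameters, the OVERLOADED path accumulates more than three simulated seconds of backoff before the guard triggers a re-resolve, while the NOT_FOUND path re-resolves on the first reply and waits for nothing.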

@snaury snaury self-assigned this Oct 29, 2024

github-actions bot commented Oct 29, 2024

2024-10-29 13:37:44 UTC Pre-commit check for 65eb5db has started.
2024-10-29 13:40:23 UTC Build linux-x86_64-release-asan is running...
🟢 2024-10-29 14:04:30 UTC Build successful.
2024-10-29 14:10:45 UTC Tests are running...
🔴 2024-10-29 16:18:06 UTC Some tests failed, follow the links below.

Test history | Test log

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| 10295 | 10201  | 0      | 21     | 27      | 46     |

🟢 2024-10-29 16:18:57 UTC ydbd size 5.6 GiB changed* by +1.7 KiB, which is < 100.0 KiB vs stable-24-3: OK

| ydbd size dash     | stable-24-3: ccf6536 | merge: 65eb5db      | diff     | diff %  |
| ydbd size          | 5 989 519 536 Bytes  | 5 989 521 240 Bytes | +1.7 KiB | +0.000% |
| ydbd stripped size | 1 501 276 256 Bytes  | 1 501 280 544 Bytes | +4.2 KiB | +0.000% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.


github-actions bot commented Oct 29, 2024

2024-10-29 13:38:06 UTC Pre-commit check for 65eb5db has started.
2024-10-29 13:40:46 UTC Build linux-x86_64-release-clang14 is running...
🟢 2024-10-29 13:47:52 UTC Build successful.


github-actions bot commented Oct 29, 2024

2024-10-29 13:38:41 UTC Pre-commit check for 65eb5db has started.
2024-10-29 13:41:20 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-10-29 14:21:03 UTC Build successful.
2024-10-29 14:21:27 UTC Tests are running...
🔴 2024-10-29 15:47:10 UTC Some tests failed, follow the links below.

Test history | Test log

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| 14592 | 13239  | 0      | 12     | 1300    | 41     |

🟢 2024-10-29 15:48:01 UTC ydbd size 8.2 GiB changed* by +6.0 KiB, which is < 100.0 KiB vs stable-24-3: OK

| ydbd size dash     | stable-24-3: ccf6536 | merge: 65eb5db      | diff       | diff %  |
| ydbd size          | 8 855 610 904 Bytes  | 8 855 617 024 Bytes | +6.0 KiB   | +0.000% |
| ydbd stripped size | 483 568 488 Bytes    | 483 569 064 Bytes   | +576 Bytes | +0.000% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

@snaury snaury marked this pull request as ready for review October 29, 2024 15:14
@snaury snaury requested a review from a team as a code owner October 29, 2024 15:14

snaury commented Oct 30, 2024

Regarding the failed tests: I looked at earlier PRs into 24-3, and the same tests fail there.

@snaury snaury merged commit c85b979 into ydb-platform:stable-24-3 Oct 30, 2024
8 of 12 checks passed
@snaury snaury deleted the bugfix-11036-slow-read-split-24-3 branch October 30, 2024 07:33