save the WriteId + PartitionId #1037

Merged on Feb 13, 2024 (29 commits)

Alek5andr-Kotov
Collaborator

  • a service (supportive) partition is created for each (WriteId, PartitionId) pair
  • the partition identifier is preserved across restarts of the PQ tablet
  • the service partition is restored when the PQ tablet starts
  • the PQ tablet subscribes to the WriteId in LongTxService
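
A minimal sketch of the bookkeeping described in the list above, assuming a plain in-memory map plus a save/restore snapshot. All names here (`TSupportivePartitionRegistry`, `kFirstSupportiveId`, the record layout) are hypothetical and are not the types used in the PR; real persistence and the LongTxService subscription are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using TWriteId = std::uint64_t;
using TPartitionId = std::uint32_t;

// Arbitrary starting id for supportive partitions in this sketch.
constexpr TPartitionId kFirstSupportiveId = 100000;

// One persisted row: which supportive partition serves a (WriteId, PartitionId) pair.
struct TSupportivePartitionRecord {
    TWriteId WriteId;
    TPartitionId OriginalPartitionId;
    TPartitionId SupportivePartitionId;
};

class TSupportivePartitionRegistry {
public:
    // Returns the supportive partition id for the pair, allocating one on first use.
    TPartitionId GetOrCreate(TWriteId writeId, TPartitionId partitionId) {
        const auto key = std::make_pair(writeId, partitionId);
        auto it = Ids_.find(key);
        if (it != Ids_.end()) {
            return it->second;
        }
        const TPartitionId id = NextId_++;
        Ids_.emplace(key, id);
        return id;
    }

    // Snapshot that would be written to durable storage.
    std::vector<TSupportivePartitionRecord> Save() const {
        std::vector<TSupportivePartitionRecord> out;
        for (const auto& kv : Ids_) {
            out.push_back({kv.first.first, kv.first.second, kv.second});
        }
        return out;
    }

    // Rebuilds the in-memory map from a snapshot, as on tablet start.
    void Restore(const std::vector<TSupportivePartitionRecord>& records) {
        Ids_.clear();
        NextId_ = kFirstSupportiveId;
        for (const auto& r : records) {
            const bool inserted =
                Ids_.emplace(std::make_pair(r.WriteId, r.OriginalPartitionId),
                             r.SupportivePartitionId).second;
            assert(inserted && "duplicate (WriteId, PartitionId) in snapshot");
            (void)inserted;
            if (r.SupportivePartitionId >= NextId_) {
                NextId_ = r.SupportivePartitionId + 1;
            }
        }
    }

private:
    std::map<std::pair<TWriteId, TPartitionId>, TPartitionId> Ids_;
    TPartitionId NextId_ = kFirstSupportiveId;
};
```

The point of the sketch is that ids handed out before a restart are the same ids seen after `Restore`, and that new ids never collide with restored ones.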

github-actions bot commented Jan 16, 2024

Note

This is an automated comment that will be appended during run.

🔴 linux-x86_64-relwithdebinfo: some tests FAILED for commit 054477b.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
59801 50556 0 15 9217 13

🔴 linux-x86_64-release-asan: some tests FAILED for commit 054477b.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
15880 15739 0 26 104 11

@@ -1023,6 +1023,14 @@ Y_UNIT_TEST_F(Cancel_Tx, TPQTabletFixture)
WaitForPQWriteTxs();
}

//Y_UNIT_TEST_F(Write_In_Tx, TPQTabletFixture)
Collaborator

Remove it please

Collaborator Author

done
f92290e

};

message TTabletTxInfo {
    optional uint64 LastStep = 2;
    optional uint64 LastTxId = 3;
}

message TTabletTxWrites {
    repeated uint64 WriteIds = 1;
Collaborator

Transpose it

Collaborator Author

done
9181228

@@ -1210,17 +1317,29 @@ void TPersQueue::Handle(TEvPQ::TEvTabletCacheCounters::TPtr& ev, const TActorCon
<< "Counters. CacheSize " << CacheCounters.CacheSizeBytes << " CachedBlobs " << CacheCounters.CacheSizeBlobs);
}

bool TPersQueue::AllPartitionsInited() const
{
    return PartitionsInited == (Partitions.size() + ShadowPartitions.size());
Collaborator

Nope. You should track only regular partitions here

Collaborator Author

done
048869b
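
A rough sketch of what the reviewer asks for in this thread, assuming the initialization counter is incremented only for regular partitions. The class and method names are hypothetical; the actual change is in the commit referenced above and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

struct TPartitionStateSketch {
    bool Inited = false;
};

class TTabletSketch {
public:
    void AddRegular(std::uint32_t id) { Partitions_[id]; }
    void AddSupportive(std::uint32_t id) { SupportivePartitions_[id]; }

    // Regular partitions contribute to the readiness counter exactly once.
    void MarkRegularInited(std::uint32_t id) {
        auto it = Partitions_.find(id);
        if (it != Partitions_.end() && !it->second.Inited) {
            it->second.Inited = true;
            ++PartitionsInited_;
        }
    }

    // Supportive partitions are tracked, but never touch the counter.
    void MarkSupportiveInited(std::uint32_t id) {
        auto it = SupportivePartitions_.find(id);
        if (it != SupportivePartitions_.end()) {
            it->second.Inited = true;
        }
    }

    // Readiness depends on regular partitions only.
    bool AllPartitionsInited() const {
        assert(PartitionsInited_ <= Partitions_.size());
        return PartitionsInited_ == Partitions_.size();
    }

private:
    std::map<std::uint32_t, TPartitionStateSketch> Partitions_;
    std::map<std::uint32_t, TPartitionStateSketch> SupportivePartitions_;
    std::size_t PartitionsInited_ = 0;
};
```

With this shape, a supportive partition that is still initializing does not delay the tablet from reporting that its regular partitions are ready.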

@@ -830,6 +832,101 @@ void TPersQueue::ReadTxInfo(const NKikimrClient::TKeyValueResponse::TReadResult&
LOG_DEBUG_S(ctx, NKikimrServices::PERSQUEUE, "Tablet " << TabletID() << " LastStep " << LastStep << " LastTxId " << LastTxId);
}

void TPersQueue::InitPlanStep(const NKikimrPQ::TTabletTxInfo& info)
Collaborator

Use a default parameter (info = {}); then the next function will no longer be required

Collaborator Author

done
102de71
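
To illustrate the default-parameter suggestion in this thread, here is a small sketch in which one function serves both the "info was read" and the "nothing stored yet" cases. `TTabletTxInfoSketch` is a hypothetical stand-in for `NKikimrPQ::TTabletTxInfo`, and `TPlanStateSketch` is not a class from the PR.

```cpp
#include <cstdint>

// Hypothetical stand-in for the protobuf message; fields default to zero.
struct TTabletTxInfoSketch {
    std::uint64_t LastStep = 0;
    std::uint64_t LastTxId = 0;
};

class TPlanStateSketch {
public:
    // With `info = {}` a separate no-argument overload is unnecessary:
    // calling InitPlanStep() uses a value-initialized info.
    void InitPlanStep(const TTabletTxInfoSketch& info = {}) {
        LastStep_ = info.LastStep;
        LastTxId_ = info.LastTxId;
    }

    std::uint64_t LastStep() const { return LastStep_; }
    std::uint64_t LastTxId() const { return LastTxId_; }

private:
    std::uint64_t LastStep_ = 0;
    std::uint64_t LastTxId_ = 0;
};
```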

LOG_INFO_S(ctx, NKikimrServices::PERSQUEUE, "Tablet " << TabletID() << " has a tx writes info");

NKikimrPQ::TTabletTxWrites info;
Y_ABORT_UNLESS(info.ParseFromString(read.GetValue()));
Collaborator

Change to poison pill

Collaborator Author

done
fdb95e3
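
A sketch of the idea behind "poison pill instead of abort", under the assumption that a failed parse should stop only the affected tablet instance rather than the whole process. The parser below is a toy stand-in for `ParseFromString` (it decodes little-endian 64-bit values), and `EInitResult` is a hypothetical name; how the poison pill is actually delivered in the actor system is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class EInitResult {
    Ok,
    PoisonPill  // the caller should shut this tablet instance down
};

// Toy parser: accepts data whose size is a multiple of 8 bytes and
// decodes each 8-byte group as a little-endian unsigned integer.
bool ParseWriteIds(const std::string& data, std::vector<std::uint64_t>& out) {
    if (data.size() % 8 != 0) {
        return false;
    }
    out.clear();
    for (std::size_t i = 0; i < data.size(); i += 8) {
        std::uint64_t value = 0;
        for (std::size_t b = 0; b < 8; ++b) {
            const auto byte = static_cast<unsigned char>(data[i + b]);
            value |= static_cast<std::uint64_t>(byte) << (8 * b);
        }
        out.push_back(value);
    }
    return true;
}

// Reports a recoverable failure instead of aborting on corrupt input.
EInitResult ReadTxWrites(const std::string& value, std::vector<std::uint64_t>& writeIds) {
    if (!ParseWriteIds(value, writeIds)) {
        return EInitResult::PoisonPill;
    }
    return EInitResult::Ok;
}
```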

HandleReserveBytesRequest(responseCookie, actorId, req, ctx, pipeClient, sender);
}

void TPersQueue::HandleWriteRequestForShadowPartition(const ui64 responseCookie,
Collaborator

Shadow->Supportive

Collaborator Author

done
4250769

} else if (req.CmdWriteSize()) {
HandleWriteRequestForShadowPartition(responseCookie, req, ctx);
} else {
Y_ABORT("CmdGetOwnership, CmdReserveBytes or CmdWrite expected");
Collaborator

reply with error, not abort!
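
One way to read this comment, sketched with hypothetical types: an unexpected request kind produces an error response for the sender instead of terminating the process. `ERequestKind` and `TResponseSketch` are illustrative only and do not correspond to the PR's request or response classes.

```cpp
#include <string>

enum class ERequestKind { GetOwnership, ReserveBytes, Write, Unknown };

struct TResponseSketch {
    bool Ok;
    std::string Error;
};

// Dispatches by request kind; anything unexpected becomes an error reply.
TResponseSketch HandleSupportivePartitionRequest(ERequestKind kind) {
    switch (kind) {
        case ERequestKind::GetOwnership:
        case ERequestKind::ReserveBytes:
        case ERequestKind::Write:
            return {true, ""};
        case ERequestKind::Unknown:
            break;
    }
    // A malformed client request must not bring the tablet down.
    return {false, "CmdGetOwnership, CmdReserveBytes or CmdWrite expected"};
}
```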

@@ -3031,6 +3309,7 @@ void TPersQueue::EndWriteTxs(const NKikimrClient::TResponse& resp,

SendReplies(ctx);
CheckChangedTxStates(ctx);
ForwardGetOwnershipToShadowPartitions(ctx);
Collaborator

Change this func to CreateSupportivePartitions

Collaborator Author

done
74c832f

partition.GetOwnershipRequests.emplace_back(params.Cookie, params.Request, params.Sender);

if (txWrite.LongTxSubscriptionStatus == NKikimrLongTxService::TEvLockStatus::STATUS_UNSPECIFIED) {
SubscribeWriteId(writeId, ctx);
Collaborator

Subscribe to the write id when a new WriteId is received, not when it is persisted

Collaborator Author

done
74c832f
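
A small sketch of the ordering requested in this thread, assuming the subscription is issued the first time a write id is seen and only once per write id. `TTxWritesSketch` and `ESubscriptionStatus` are hypothetical; the vector of sent subscriptions stands in for messages to LongTxService.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

enum class ESubscriptionStatus { Unspecified, Subscribed };

class TTxWritesSketch {
public:
    // Called for every request that carries a write id.
    void OnWriteId(std::uint64_t writeId) {
        auto result = TxWrites_.emplace(writeId, ESubscriptionStatus::Unspecified);
        if (result.second) {
            // First time this write id is seen: subscribe immediately,
            // without waiting for the write id to be persisted.
            SentSubscriptions_.push_back(writeId);
            result.first->second = ESubscriptionStatus::Subscribed;
        }
    }

    const std::vector<std::uint64_t>& SentSubscriptions() const {
        return SentSubscriptions_;
    }

private:
    std::unordered_map<std::uint64_t, ESubscriptionStatus> TxWrites_;
    std::vector<std::uint64_t> SentSubscriptions_;
};
```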

{
auto& record = ev->Get()->Record;
ui64 writeId = record.GetLockId();
if (TxWrites.contains(writeId)) {
Collaborator

If lockStatus == NO TX, the WriteId is present in the in-memory state, and there is no propose in the in-memory state, delete it and all its partitions

Collaborator

Change the deletion of a WriteId to:

  1. Move the write id to an in-memory map to_be_deleted_write_id -> shadow_partitions.
  2. Send the ClearAllAndDie event to all shadow partitions and wait for the results.
  3. When all results for the partitions of the write id have been received, schedule deletion of all info for the write id in the tablet state.

Collaborator Author

We agreed to do this in a separate task, where I described the algorithm above:
LOGBROKER-8871
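
The three-step deletion proposed in this thread could be sketched as follows. This is only an illustration under stated assumptions: the class name and methods are hypothetical, sending `ClearAllAndDie` is represented by returning the list of target partitions, and the final persisted-state cleanup is represented by a `true` return value. The work itself was deferred to the separate task.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

using TWriteId = std::uint64_t;
using TPartitionId = std::uint32_t;

class TWriteIdDeletionSketch {
public:
    void AddPartition(TWriteId writeId, TPartitionId partitionId) {
        Active_[writeId].insert(partitionId);
    }

    // Steps 1 and 2: move the write id into the to-be-deleted map and return
    // the partitions that should receive ClearAllAndDie.
    std::vector<TPartitionId> BeginDeletion(TWriteId writeId) {
        auto it = Active_.find(writeId);
        if (it == Active_.end()) {
            return {};
        }
        std::vector<TPartitionId> targets(it->second.begin(), it->second.end());
        ToBeDeleted_[writeId] = std::move(it->second);
        Active_.erase(it);
        return targets;
    }

    // Step 3: record one partition's reply. Returns true once every partition
    // of the write id has replied, i.e. when its stored info can be deleted.
    bool OnPartitionCleared(TWriteId writeId, TPartitionId partitionId) {
        auto it = ToBeDeleted_.find(writeId);
        if (it == ToBeDeleted_.end()) {
            return false;
        }
        it->second.erase(partitionId);
        if (!it->second.empty()) {
            return false;
        }
        ToBeDeleted_.erase(it);
        return true;
    }

    bool IsActive(TWriteId writeId) const { return Active_.count(writeId) != 0; }
    bool IsPendingDeletion(TWriteId writeId) const { return ToBeDeleted_.count(writeId) != 0; }

private:
    std::map<TWriteId, std::set<TPartitionId>> Active_;
    std::map<TWriteId, std::set<TPartitionId>> ToBeDeleted_;
};
```

Keeping the pending write ids in a separate map means new requests cannot pick up a write id that is already being torn down.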

github-actions bot commented Jan 26, 2024

2024-01-26 09:46:55 UTC Pre-commit check for 4d61b04 has started.
2024-01-26 09:46:58 UTC Build linux-x86_64-release-asan is running...
2024-01-26 10:02:21 UTC Check cancelled

github-actions bot commented Jan 26, 2024

2024-01-26 09:47:04 UTC Pre-commit check for 4d61b04 has started.
2024-01-26 09:47:06 UTC Build linux-x86_64-relwithdebinfo is running...
2024-01-26 10:02:22 UTC Check cancelled

github-actions bot commented Jan 26, 2024

2024-01-26 10:03:22 UTC Pre-commit check for b03cac9 has started.
2024-01-26 10:03:24 UTC Build linux-x86_64-release-asan is running...
🟢 2024-01-26 10:27:42 UTC Build successful.
2024-01-26 10:27:59 UTC Tests are running...
🔴 2024-01-26 12:04:41 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
15954 15830 0 54 47 23

github-actions bot commented Jan 26, 2024

2024-01-26 10:05:03 UTC Pre-commit check for b03cac9 has started.
2024-01-26 10:05:04 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-01-26 10:33:52 UTC Build successful.
2024-01-26 10:34:02 UTC Tests are running...
🔴 2024-01-26 12:09:14 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
60185 50852 0 40 9250 43

github-actions bot commented Jan 31, 2024

2024-01-31 18:24:25 UTC Pre-commit check for 0e0dd39 has started.
2024-01-31 18:24:28 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-01-31 19:02:40 UTC Build successful.
2024-01-31 19:02:53 UTC Tests are running...
🔴 2024-01-31 20:29:27 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
60256 50909 0 40 9268 39

github-actions bot commented Jan 31, 2024

2024-01-31 18:28:00 UTC Pre-commit check for 0e0dd39 has started.
2024-01-31 18:28:01 UTC Build linux-x86_64-release-asan is running...
🟢 2024-01-31 19:06:30 UTC Build successful.
2024-01-31 19:06:42 UTC Tests are running...
🔴 2024-01-31 20:40:43 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
16036 15888 0 54 54 40

github-actions bot commented Feb 9, 2024

2024-02-09 07:33:45 UTC Pre-commit check for 01667bf has started.
2024-02-09 07:33:48 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-09 07:52:04 UTC Build successful.
2024-02-09 07:52:16 UTC Tests are running...
🔴 2024-02-09 08:27:16 UTC Test run completed, no test results found for commit d2b1192. Please check build logs.
2024-02-09 08:27:19 UTC Check cancelled

github-actions bot commented Feb 9, 2024

2024-02-09 07:33:47 UTC Pre-commit check for 01667bf has started.
2024-02-09 07:33:49 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-09 07:50:52 UTC Build successful.
2024-02-09 07:51:07 UTC Tests are running...
🔴 2024-02-09 08:27:10 UTC Test run completed, no test results found for commit d2b1192. Please check build logs.
2024-02-09 08:27:13 UTC Check cancelled

github-actions bot commented Feb 9, 2024

2024-02-09 08:30:44 UTC Pre-commit check for 83a27f1 has started.
2024-02-09 08:30:49 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-09 08:35:58 UTC Build successful.
2024-02-09 08:36:13 UTC Tests are running...
🔴 2024-02-09 09:29:34 UTC Test run completed, no test results found for commit 048869b. Please check build logs.
2024-02-09 09:29:38 UTC Check cancelled

github-actions bot commented Feb 9, 2024

2024-02-09 08:30:53 UTC Pre-commit check for 83a27f1 has started.
2024-02-09 08:30:56 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-09 08:36:33 UTC Build successful.
2024-02-09 08:36:46 UTC Tests are running...
🔴 2024-02-09 09:29:34 UTC Test run completed, no test results found for commit 048869b. Please check build logs.
2024-02-09 09:29:37 UTC Check cancelled

github-actions bot commented Feb 9, 2024

2024-02-09 09:31:01 UTC Pre-commit check for 853eece has started.
2024-02-09 09:31:03 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-09 10:11:30 UTC Build successful.
2024-02-09 10:11:42 UTC Tests are running...
🔴 2024-02-09 11:44:06 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67410 56476 0 3 10892 39

github-actions bot commented Feb 9, 2024

2024-02-09 09:32:02 UTC Pre-commit check for 853eece has started.
2024-02-09 09:32:04 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-09 10:12:06 UTC Build successful.
2024-02-09 10:12:19 UTC Tests are running...
🔴 2024-02-09 11:51:45 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14735 14552 0 14 129 40

alexnick88 previously approved these changes Feb 9, 2024

github-actions bot commented Feb 12, 2024

2024-02-12 09:02:33 UTC Pre-commit check for e1a2465 has started.
2024-02-12 09:02:35 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-12 09:48:20 UTC Build successful.
2024-02-12 09:48:30 UTC Tests are running...
🔴 2024-02-12 11:32:34 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14746 14562 0 17 136 31

github-actions bot commented Feb 12, 2024

2024-02-12 09:02:59 UTC Pre-commit check for e1a2465 has started.
2024-02-12 09:03:00 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-12 09:47:15 UTC Build successful.
2024-02-12 09:47:28 UTC Tests are running...
🔴 2024-02-12 11:33:44 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67318 56391 0 2 10892 33

github-actions bot commented Feb 13, 2024

2024-02-13 15:49:04 UTC Pre-commit check for cc3434e has started.
2024-02-13 15:49:06 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-13 15:50:15 UTC Build successful.
2024-02-13 15:50:24 UTC Tests are running...
🔴 2024-02-13 15:52:45 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14765 14559 0 25 134 47

github-actions bot commented Feb 13, 2024

2024-02-13 15:49:13 UTC Pre-commit check for cc3434e has started.
2024-02-13 15:49:14 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-13 15:50:17 UTC Build successful.
2024-02-13 15:50:26 UTC Tests are running...
🔴 2024-02-13 16:33:49 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67469 56393 0 6 11000 70

3 participants