[yugabyte#10935] CDCSDK: Provide tablet split support to CDCSDK Service
Summary:

**Adding entries to the cdc_state table for the children tablets**
After the split has been registered, but before the split requests are sent, we start processing the metadata for CDCSDK as part of the TabletSplitManager::ProcessSplitTabletResult call. For each stream which has an entry with the parent tablet, we will add entries for the child tablets with checkpoint 0.0.

**Retain Parent tablet until GetChanges catches up with SplitOp**
While any tablet is being deleted after a tablet split, we will first check whether there is any active CDCSDK stream associated with the table of the tablet, using the newly introduced map: cdcsdk_tables_to_stream_map_. If there is an active stream associated with the table, we will add the tablet to a new map: retained_by_cdcsdk_ (std::unordered_map<TabletId, HiddenReplicationParentTabletInfo>), which indicates that the hidden tablet is still needed for CDCSDK. We will mark any tablet which has an active CDCSDK stream (that is, any tablet which is part of retained_by_cdcsdk_) as hidden. This way the tablet will not get deleted until we have reported all the required changes to the CDC client. This will be done as part of processing the split, in the function CatalogManager::DeleteTabletListAndSendRequests. This approach is similar to the one followed by xcluster (change introduced by PR: yugabyte@288330d).

The cdcsdk_tables_to_stream_map_ and retained_by_cdcsdk_ maps are in-memory data structures, but they will be re-initialized on master restarts.

**Detect Tablet SplitOp from GetChanges**
In the function GetChangesForCDCSDK, when we read the ops to operate on from the function ReadReplicatedMessagesForCDC, we will know that a tablet split occurred when we encounter an op of type yb::consensus::OperationType::SPLIT_OP.
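The detection step described above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: OperationType, OpId, ReplicatedMsg and FindSplitOp are hypothetical stand-ins defined here, not the actual yb::consensus or CDC service definitions.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for yb::consensus::OperationType (only a few values).
enum class OperationType { WRITE_OP, UPDATE_TRANSACTION_OP, SPLIT_OP };

// Hypothetical stand-in for an OpId, written as term.index in the text.
struct OpId {
  int64_t term;
  int64_t index;
};

// Hypothetical stand-in for one message read from the replicated log.
struct ReplicatedMsg {
  OpId id;
  OperationType type;
};

// Returns the position of the first SPLIT_OP in the batch, or std::nullopt
// if the batch contains no split op.
std::optional<std::size_t> FindSplitOp(const std::vector<ReplicatedMsg>& msgs) {
  for (std::size_t i = 0; i < msgs.size(); ++i) {
    if (msgs[i].type == OperationType::SPLIT_OP) {
      return i;
    }
  }
  return std::nullopt;
}
```

In this sketch, the caller would treat a returned position as the signal that a tablet split was recorded in the batch it just read.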
The list of things to be done when we detect a split op is:
# First confirm that the tablet split has indeed occurred.
# Validate that there are no more write ops/update ops on the tablet.
# Do not update the checkpoint which is sent as a response to the client, so that the next GetChanges call will be made with the OpId of the last operation just before the SplitOp.

Please note that we will not communicate to the client that the tablet split occurred in this call itself, but rather send the data up to the SplitOp to the client. The client will get to know about the split in its next ‘GetChanges’ call.

**Communicating Tablet Split to Client**
The client will next call GetChanges again with the fromOpId of the operation just before the SplitOp. So if the first message we read from ‘ReadReplicatedMessagesForCDC’ is the SplitOp message, we will know that the client already has all the data preceding the split. In such cases we will:
# Update the checkpoint for the tablet with the OpId of the SplitOp.
# Verify that entries exist for the children tablets; if not, add entries for the children tablets, with checkpoint equal to the SplitOpId, to the ‘cdc_state’ table.
# Remove the entry of the tablet/stream pair from the ‘cdc_state’ table.
# Return an error code: TABLET_SPLIT.

**Dealing with GetChanges on the parent tablet after TabletSplit**
If the client calls “GetChanges” on a tablet which is not found, we will call “GetTablets”: if we see that the call is on a parent tablet, we will return a TABLET_SPLIT error. If the tabletId is neither a parent nor an active tablet returned by “GetTablets”, we will return a TABLET_NOT_FOUND error.

**Deletion of parent tablet**
There is a background thread which handles tablet deletion. We will add a new function, DoProcessCDCSDKClusterTabletDeletion, which will check if there is at least one entry for the parent tablet in the ‘cdc_state’ table with a checkpoint other than (-1.-1).
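The check on the parent tablet's ‘cdc_state’ rows described above might look roughly like the sketch below. The row layout (CdcStateRow), the helper IsInvalidCheckpoint and the function ParentStillNeeded are hypothetical names introduced only for illustration; the real ‘cdc_state’ schema and the body of DoProcessCDCSDKClusterTabletDeletion are not shown in this summary.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an OpId, written as term.index in the text.
struct OpId {
  int64_t term;
  int64_t index;
};

// Hypothetical, simplified shape of one row of the cdc_state table.
struct CdcStateRow {
  std::string tablet_id;
  std::string stream_id;
  OpId checkpoint;
};

// True when the checkpoint is the (-1.-1) marker mentioned in the text.
bool IsInvalidCheckpoint(const OpId& checkpoint) {
  return checkpoint.term == -1 && checkpoint.index == -1;
}

// True if at least one row for the given tablet has a checkpoint other
// than (-1.-1), i.e. some stream may still need data from that tablet.
bool ParentStillNeeded(const std::vector<CdcStateRow>& rows,
                       const std::string& parent_tablet_id) {
  return std::any_of(rows.begin(), rows.end(), [&](const CdcStateRow& row) {
    return row.tablet_id == parent_tablet_id &&
           !IsInvalidCheckpoint(row.checkpoint);
  });
}
```

In this sketch a false result corresponds to the case where no row satisfies the criterion.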
If we do not find a single row which satisfies this criterion:
# We will remove the tablet’s entry from ‘retained_by_cdcsdk_’.
# We will also remove the rows for this tablet from the ‘cdc_state’ table.
# We will update the child’s checkpoint to -1.-1 (if this is not done, any child tablet on which we have not started streaming will unnecessarily retain data).

The function ‘CleanupHiddenTablets’ will later delete the tablet, since it has been removed from ‘retained_by_cdcsdk_’.

We will re-use the bg thread which handles the deletion of tablets retained by xcluster (introduced in PR: yugabyte@288330d). This bg task will now run two functions: DoProcessCDCSDKClusterTabletDeletion, along with the existing DoProcessXClusterParentTabletDeletion.

**Dealing with client side restarts before TabletSplit is communicated**
If the cdcsdk client crashes/restarts before the tablet split is communicated to it, there can still be data which needs to be streamed from the parent tablet, but a “GetTablets” call from the client will not return the parent tablet Id. To support this scenario we will introduce a new API: “GetTabletListToPollForCdc”. This API will take the table name and stream id as request parameters, and return all the tablets and their checkpoints by scanning the ‘cdc_state’ table.

The list of things the new “GetTabletListToPollForCdc” API will do is:
# Call “GetTablets” for the relevant tabletId, and derive the set of parent tablets and child tablets from the results of this API (the API returns the parent tablet id of each child tablet).
# Iterate over the rows belonging to the required stream in the ‘cdc_state’ table, and filter down to only the relevant rows.
We decide whether to add a tablet’s info to the final result based on the conditions below:
# Add every tablet which is neither a child nor a parent tablet to the result.
# If the tablet is neither a parent tablet nor an active tablet returned by the ‘GetTablets’ call, we do not add the tablet to the result. This happens in scenarios where the tablet split has been initiated but not completed.
# If the tablet is a parent tablet (and not a child), we add the parent tablet to the result if we have not started polling on any of its children, or if we have not yet reported the tablet split to the client (when the tablet split is reported to the client, the child’s checkpoint is changed to the SplitOp record’s OpId).
# If the tablet is a child and we have started polling on it, we add the child to the result.
# If the tablet is a child and we have not started polling on it, we check whether the parent tablet has already been polled: if so, we do not add the child tablet to the result; if the parent tablet has not been polled, we add the child tablet to the result.

Test Plan: Added ctests

Reviewers: skumar, srangavajjula, aagrawal, rvenkatesh, jhe, sdash, vkushwaha

Reviewed By: jhe, sdash, vkushwaha

Subscribers: bogdan

Differential Revision: http://phabricator.dev.yugabyte.com/D18638