[kv store] add watermark table to bigtable #20390
        Some((key_bytes, _)) => Ok(u64::from_be_bytes(key_bytes.as_slice().try_into()?)),
        None => Ok(0),
    }
    self.get_watermark(AGGREGATED_WATERMARK_NAME).await
Can you explain what "AGGREGATED_WATERMARK_NAME" is and why we wouldn't use the checkpoint task name? Or do we just have 1 "task" that is responsible for all the bigtable writes?
Yeah, we have a single task for all BigTable writes.
On the sui-data-ingestion side, DDB (DynamoDB) will eventually be replaced with BigTable, so our instance will store multiple entries for different workflows (e.g., archival, blob storage).
For community deployments, however, there will be only a single entry.
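To make the "one entry per workflow" idea concrete, here is a minimal in-memory sketch of a watermark table keyed by name. It is not the BigTable client code from this PR: `WatermarkTable`, `set_watermark`, and the string value of `AGGREGATED_WATERMARK_NAME` are assumptions for illustration. Only the big-endian `u64` decoding and the "missing entry means 0" behavior mirror the diff excerpt above.

```rust
use std::collections::HashMap;
use std::convert::TryInto;

/// Assumed value; the real constant's value is not shown in this conversation.
const AGGREGATED_WATERMARK_NAME: &str = "aggregated";

/// Hypothetical in-memory stand-in for a watermark table: one row per workflow name.
#[derive(Default)]
struct WatermarkTable {
    rows: HashMap<String, Vec<u8>>,
}

impl WatermarkTable {
    /// Store the checkpoint as 8 big-endian bytes under the given name.
    fn set_watermark(&mut self, name: &str, checkpoint: u64) {
        self.rows
            .insert(name.to_string(), checkpoint.to_be_bytes().to_vec());
    }

    /// Decode the stored bytes back into a u64; a missing entry reads as 0.
    fn get_watermark(&self, name: &str) -> Result<u64, std::array::TryFromSliceError> {
        match self.rows.get(name) {
            Some(bytes) => {
                let raw: [u8; 8] = bytes.as_slice().try_into()?;
                Ok(u64::from_be_bytes(raw))
            }
            None => Ok(0),
        }
    }
}
```

In this sketch a community deployment would only ever touch the `AGGREGATED_WATERMARK_NAME` row, while an instance running several workflows would add rows such as `"archival"` next to it.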
This reverts commit 7eadb13.
## Description
The hot row-hot cell pattern significantly slows down writes to Bigtable, impacting the ingestion speed of core pipelines
---
## Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.
- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API:
Description
This PR adds a watermark table to BigTable. The internal workflow progress store now writes watermarks to both DynamoDB and BigTable. In the future, the DynamoDB progress store will be deprecated.
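The dual-write behavior described here can be sketched roughly as follows. This is a simplified, synchronous illustration rather than the PR's implementation: the `ProgressStore` trait shape, `InMemoryStore`, and `DualWriteStore` are all hypothetical names, and the real stores talk to DynamoDB and BigTable instead of a `HashMap`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified progress-store interface (the real one is not shown here).
trait ProgressStore {
    fn load(&self, task: &str) -> u64;
    fn save(&mut self, task: &str, checkpoint: u64) -> Result<(), String>;
}

/// In-memory stand-in for a backend such as DynamoDB or BigTable.
#[derive(Default)]
struct InMemoryStore {
    watermarks: HashMap<String, u64>,
}

impl ProgressStore for InMemoryStore {
    fn load(&self, task: &str) -> u64 {
        // A task with no saved watermark starts from checkpoint 0.
        self.watermarks.get(task).copied().unwrap_or(0)
    }

    fn save(&mut self, task: &str, checkpoint: u64) -> Result<(), String> {
        self.watermarks.insert(task.to_string(), checkpoint);
        Ok(())
    }
}

/// Writes every watermark to both backends; reads come from the primary only.
struct DualWriteStore<A: ProgressStore, B: ProgressStore> {
    primary: A,
    secondary: B,
}

impl<A: ProgressStore, B: ProgressStore> ProgressStore for DualWriteStore<A, B> {
    fn load(&self, task: &str) -> u64 {
        self.primary.load(task)
    }

    fn save(&mut self, task: &str, checkpoint: u64) -> Result<(), String> {
        // Stop at the first failure so the caller can retry the whole save.
        self.primary.save(task, checkpoint)?;
        self.secondary.save(task, checkpoint)
    }
}
```

Reading from the primary while writing to both is one way to keep the old store authoritative until the new one is trusted; whether the PR makes the same choice is not stated in the description.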
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.