# indexer-alt: reader watermark task #20216

## Conversation
lgtm
```rust
// Pipelines that are split up into a summary table, and a write-ahead log, where the
// write-ahead log needs to be pruned.
let pruner_config = lag.map(|l| PrunerConfig {
    interval: consistent_pruning_interval,
    // Retain at least twice as much data as the lag, to guarantee overlap between the
    // summary table and the write-ahead log.
    retention: l * 2,
    // Prune roughly five minutes of data in one go.
    max_chunk_size: 5 * 300,
});
```
digging the comments
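For readers skimming the diff, here is a rough sketch of the config struct this hunk fills in. Only the three field names appear in the hunk; the types and doc comments are assumptions:

```rust
/// Assumed shape of `PrunerConfig`; only the field names below are taken from
/// the hunk above, everything else is illustrative.
pub struct PrunerConfig {
    /// How often the pruner checks whether there is anything to prune.
    pub interval: std::time::Duration,
    /// How many checkpoints to keep behind the committer's high watermark.
    /// The hunk sets this to twice the consistent-range lag, so the summary
    /// table and the write-ahead log are guaranteed to overlap.
    pub retention: u64,
    /// Upper bound on checkpoints pruned in one chunk. `5 * 300` reads as an
    /// assumed ~5 checkpoints per second times 300 seconds, i.e. roughly five
    /// minutes of data.
    pub max_chunk_size: u64,
}
```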
```rust
// Unpruned concurrent pipelines
indexer.concurrent_pipeline(EvEmitMod, None).await?;
indexer.concurrent_pipeline(EvStructInst, None).await?;
```
do we plan to enable pruning on these tables later?
Yes, we will, but that needs some more work on configuration support.
```rust
let new_reader_lo = (current.checkpoint_hi_inclusive as u64 + 1)
    .saturating_sub(config.retention);

if new_reader_lo <= current.reader_lo as u64 {
```
Should be pretty rare that this occurs, right? `new_reader_lo < current.reader_lo` is unlikely to happen, except in the case where the pruning config is made more lenient.
Main reason for this check is in case there are multiple instances of the indexer running at the same time.
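As a worked example, using the test plan at the bottom of this PR: `--consistent-range 100` gives a lag of 100 checkpoints, so `retention = 100 * 2 = 200`, and with `checkpoint_hi_inclusive = 10000` the task computes `new_reader_lo = (10000 + 1) - 200 = 9801`, which is exactly the `reader_lo` reported for `wal_obj_types` after the second run.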
```rust
/// Update the reader low watermark for an existing watermark row, as long as this raises the
/// watermark, and update the timestamp of this change to the database's current time.
///
/// Returns a boolean indicating whether the watermark was actually updated or not.
pub async fn update(&self, conn: &mut Connection<'_>) -> QueryResult<bool> {
    Ok(diesel::update(watermarks::table)
        .set((self, watermarks::pruner_timestamp.eq(diesel::dsl::now)))
        .filter(watermarks::pipeline.eq(&self.pipeline))
        .filter(watermarks::reader_lo.lt(self.reader_lo))
        .execute(conn)
        .await?
        > 0)
}
```
I see, so the reader_lo task updates on an interval, but whether the writes actually go through is decided at the DB level, by the filtering conditions here. Neat, and it simplifies the task even further.
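For reference, the UPDATE that this diesel builder should compile to is approximately the following (a reconstruction from the builder calls above, not captured output; the `SET` clause also includes whatever columns are serialized from `self`):

```sql
UPDATE watermarks
SET reader_lo = $1,
    pruner_timestamp = NOW()
WHERE pipeline = $2
  AND reader_lo < $1;
```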
## Description

Change `watermark.pruner_timestamp_ms BIGINT` to `watermark.pruner_timestamp TIMESTAMP`. Unlike the other timestamps in this table, which are read from checkpoint data, this timestamp is read from the database, so it is easiest to represent it using the database's native type for timestamps.

## Test plan

CI
## Description

The reader watermark task is one half of the pruning implementation. It applies only to concurrent pipelines, and tracks the lowest checkpoint sequence number the reader is guaranteed to get data for, and the timestamp (from the DB) at which this threshold was set.

This task does not do any actual pruning, but it sets the limit that the pruner can go to, and a time at which it can start pruning to that point. A later PR will introduce the actual pruning task.

Unlike the existing pruner implementation, this system does not support pruning by epoch, only by checkpoint. This was for two reasons:

- Pruning no longer works by dropping partitions, so we can keep the implementation simple by only supporting the form of pruning we need (by checkpoint sequence number).
- Pruning is part of the indexing framework, rather than a specific part of our indexer implementation, so pruning by epoch would require that the indexing framework kept track of a mapping from epoch to checkpoint sequence number.

Along with the reader watermark, the following changes were made:

- The existing watermark task was renamed to the `committer_watermark` to distinguish it.
- The interface between pipelines and the indexing framework has been simplified to return just one `JoinHandle`. This was because clippy complained that returning 6 handles was too complex, and additionally, because the reader watermark task is not connected to the rest of the tasks for a pipeline, it needs more complicated teardown logic: first the committer tasks are wound down, and then the pruner is explicitly cancelled, before its tasks are awaited as well (see the teardown sketch after the test plan).
- `watermarks.pruner_timestamp_ms: BIGINT` was turned into `watermarks.pruner_timestamp: TIMESTAMP`, because we always read/write it from the database's `NOW(): TIMESTAMP` function.

The change also sets up pruning configs for write-ahead log tables, so that we can test reader watermark behaviour.

## Test plan

Run the indexer with just one of the consistent read summary and write-ahead logs, and then query the watermarks table:

```
cargo run -p sui-indexer-alt --release -- \
  --database-url "postgres://postgres:postgrespw@localhost:5432/sui_indexer_alt" \
  indexer --remote-store-url https://checkpoints.mainnet.sui.io \
  --last-checkpoint 10000 \
  --consistent-range 100 --consistent-pruning-interval 10 \
  --pipeline sum_obj_types --pipeline wal_obj_types
```

```
sui_indexer_alt=# SELECT * FROM watermarks;
   pipeline    | epoch_hi_inclusive | checkpoint_hi_inclusive | tx_hi | timestamp_ms_hi_inclusive | epoch_lo | reader_lo |      pruner_timestamp      | pruner_hi
---------------+--------------------+-------------------------+-------+---------------------------+----------+-----------+----------------------------+-----------
 sum_obj_types |                  1 |                    9900 |  9902 |             1681405380769 |        0 |         0 | 1970-01-01 00:00:00        |         0
 wal_obj_types |                  1 |                   10000 | 10002 |             1681405504878 |        0 |      6321 | 2024-11-10 17:33:11.459593 |         0
(2 rows)
```

Note that `reader_lo` is not guaranteed to be as advanced as it could be, because this task checks in relatively infrequently, and the indexer as a whole might shut down before it has a chance to catch up. However, on restart, the indexer should fast-forward the watermark (i.e. if the steps above are repeated, `reader_lo` will be updated):

```
sui_indexer_alt=# SELECT * FROM watermarks;
   pipeline    | epoch_hi_inclusive | checkpoint_hi_inclusive | tx_hi | timestamp_ms_hi_inclusive | epoch_lo | reader_lo |      pruner_timestamp      | pruner_hi
---------------+--------------------+-------------------------+-------+---------------------------+----------+-----------+----------------------------+-----------
 sum_obj_types |                  1 |                    9900 |  9902 |             1681405380769 |        0 |         0 | 1970-01-01 00:00:00        |         0
 wal_obj_types |                  1 |                   10000 | 10002 |             1681405504878 |        0 |      9801 | 2024-11-10 17:33:49.913129 |         0
(2 rows)
```
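As referenced in the description above, here is a minimal sketch of what the staged teardown behind the single `JoinHandle` could look like, assuming tokio and tokio_util. All names here are hypothetical; only the ordering (committer side first, then explicitly cancel and await the pruner side) comes from the description:

```rust
use tokio::task::JoinHandle;
use tokio_util::sync::CancellationToken;

/// Hypothetical helper that folds a pipeline's tasks into one handle.
fn merge_handles(
    committer: JoinHandle<()>,
    reader_watermark: JoinHandle<()>,
    pruner_cancel: CancellationToken,
) -> JoinHandle<()> {
    tokio::spawn(async move {
        // Let the committer-side tasks wind down naturally first.
        let _ = committer.await;
        // The reader watermark task is not connected to the rest of the
        // pipeline, so it must be cancelled explicitly before being awaited.
        pruner_cancel.cancel();
        let _ = reader_watermark.await;
    })
}
```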
## Description

This field turns out not to be used by the new pruner implementation, because it is entirely based on checkpoints.

## Test plan

CI

## Stack

- #20149
- #20150
- #20166
- #20216
- #20217

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API: