indexer: configure PG work mem for perf #17853
nice!
## Description

This fixes the slow `objects_snapshot` update query. In the past we had fewer object mutations; with the recent increase in object mutations, the `objects_snapshot` table has started to fall behind on mainnet.

## Test plan

Perf comparison with and without a larger `work_mem`, on a separately cloned DB:

```
Insert on objects_snapshot  (cost=1787923.79..1810332.86 rows=0 width=0) (actual time=1007038.109..1007038.110 rows=0 loops=1)
  Conflict Resolution: UPDATE
  Conflict Arbiter Indexes: objects_snapshot_pkey
  Tuples Inserted: 0
  Conflicting Tuples: 341814
  ->  Subquery Scan on subquery  (cost=1787923.79..1810332.86 rows=3448 width=797) (actual time=15796.923..20093.768 rows=341814 loops=1)
        Filter: (subquery.rn = 1)
        ->  WindowAgg  (cost=1787923.79..1801713.99 rows=689510 width=805) (actual time=15796.921..19698.640 rows=341814 loops=1)
              Run Condition: (row_number() OVER (?) <= 1)
              ->  Sort  (cost=1787923.79..1789647.56 rows=689510 width=797) (actual time=15796.907..17168.209 rows=667901 loops=1)
                    Sort Key: objects_history.object_id, objects_history.object_version DESC
                    Sort Method: external merge  Disk: 352584kB
                    ->  Index Scan using objects_history_partition_403_checkpoint_sequence_number_ob_idx on objects_history_partition_403 objects_history  (cost=0.57..1235565.83 rows=689510 width=797) (actual time=6.255..14802.642 rows=667901 loops=1)
                          Index Cond: ((checkpoint_sequence_number >= 34475349) AND (checkpoint_sequence_number < 34475949))
Planning Time: 0.261 ms
Execution Time: 1007093.194 ms
```

With `work_mem` set to 16GB:

```
Insert on objects_snapshot  (cost=1093107.47..1111767.31 rows=0 width=0) (actual time=19278.181..19278.184 rows=0 loops=1)
  Conflict Resolution: UPDATE
  Conflict Arbiter Indexes: objects_snapshot_pkey
  Tuples Inserted: 0
  Conflicting Tuples: 281860
  ->  Subquery Scan on subquery  (cost=1093107.47..1111767.31 rows=2871 width=797) (actual time=599.297..1225.728 rows=281860 loops=1)
        Filter: (subquery.rn = 1)
        ->  WindowAgg  (cost=1093107.47..1104590.45 rows=574149 width=805) (actual time=599.295..1155.662 rows=281860 loops=1)
              Run Condition: (row_number() OVER (?) <= 1)
              ->  Sort  (cost=1093107.47..1094542.84 rows=574149 width=797) (actual time=599.257..783.334 rows=564334 loops=1)
                    Sort Key: objects_history.object_id, objects_history.object_version DESC
                    Sort Method: quicksort  Memory: 342007kB
                    ->  Index Scan using objects_history_partition_403_checkpoint_sequence_number_ob_idx on objects_history_partition_403 objects_history  (cost=0.57..1038187.06 rows=574149 width=797) (actual time=0.026..120.766 rows=564334 loops=1)
                          Index Cond: ((checkpoint_sequence_number >= 34476549) AND (checkpoint_sequence_number < 34477149))
Planning Time: 3.929 ms
Execution Time: 19297.406 ms
```

Local run to make sure that `objects_snapshot` can be updated with the new code.

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
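The two plans in the test plan cover different checkpoint ranges and touch different numbers of rows (341814 versus 281860 conflicting tuples), so the raw execution times are not directly comparable. As a rough sketch, the comparison can be normalized per upserted tuple. The `Run` struct and `speedups` helper below are hypothetical names introduced only for this illustration; the figures are copied from the plans.

```rust
/// One EXPLAIN ANALYZE run: total execution time and rows upserted.
/// (Hypothetical helper type, not part of the indexer code.)
struct Run {
    execution_ms: f64,
    conflicting_tuples: u64,
}

impl Run {
    /// Average wall-clock cost per upserted row, in milliseconds.
    fn ms_per_tuple(&self) -> f64 {
        self.execution_ms / self.conflicting_tuples as f64
    }
}

/// How many times faster `after` is than `before`: (raw, per tuple).
fn speedups(before: &Run, after: &Run) -> (f64, f64) {
    (
        before.execution_ms / after.execution_ms,
        before.ms_per_tuple() / after.ms_per_tuple(),
    )
}

fn main() {
    // "Execution Time" and "Conflicting Tuples" copied from the two plans.
    let default_mem = Run { execution_ms: 1_007_093.194, conflicting_tuples: 341_814 };
    let large_mem = Run { execution_ms: 19_297.406, conflicting_tuples: 281_860 };

    let (raw, per_tuple) = speedups(&default_mem, &large_mem);
    println!("raw speedup: {raw:.1}x, per-tuple speedup: {per_tuple:.1}x");
}
```

With these numbers the raw ratio is roughly 52x and the per-tuple ratio roughly 43x, so the improvement holds after accounting for the smaller batch in the second run.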
## Description

see the original PR

## Test plan

see test plan of the original PR

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
.unwrap();
let pg_work_mem_query_string = format!("SET work_mem = '{}GB'", work_mem_gb);
let pg_work_mem_query = pg_work_mem_query_string.as_str();
transactional_blocking_with_retry!(
I don't think this actually works, since we're setting `work_mem` in a separate transaction? Only asking because I'm looking at the most recent `objects_snapshot` update and it's been running for more than 20 minutes now.
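One way to address the concern above, as a sketch only: in PostgreSQL, `SET LOCAL` applies until the end of the current transaction, so issuing it inside the same transaction as the update ties the setting to that query regardless of which pooled connection runs it. The helper below is hypothetical (it is not the PR's code and does not use the indexer's `transactional_blocking_with_retry!` macro); it only builds the statement sequence, and the update query is passed in as an opaque placeholder.

```rust
/// Builds the statements for a single transaction that raises `work_mem`
/// only for its own duration. `SET LOCAL` reverts at COMMIT or ROLLBACK.
/// (Hypothetical helper for illustration; the caller supplies the query.)
fn statements_with_local_work_mem(work_mem_gb: u32, update_query: &str) -> Vec<String> {
    vec![
        "BEGIN".to_string(),
        // Same value format as the PR's `SET work_mem = '{}GB'`, but scoped
        // to this transaction instead of the whole session.
        format!("SET LOCAL work_mem = '{}GB'", work_mem_gb),
        update_query.to_string(),
        "COMMIT".to_string(),
    ]
}

fn main() {
    // Placeholder standing in for the real objects_snapshot upsert.
    let update_query = "/* objects_snapshot upsert query */";

    for statement in statements_with_local_work_mem(16, update_query) {
        println!("{statement};");
    }
}
```

Whether the setting is currently being applied could be checked by running `SHOW work_mem` on the same connection right before the update, or by looking for `Sort Method: quicksort` rather than `external merge` in the plan.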