-
Notifications
You must be signed in to change notification settings - Fork 134
feat: optimize database indexes for improved query performance #1119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis PR introduces five database index optimization migrations targeting the moves, accounts_metadata, and transactions_metadata tables for Point-in-Time queries, followed by a migration that drops legacy indexes. It also refactors a test to use local variable scoping for cleanup. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Disabled knowledge base sources:
⛔ Files ignored due to path filters (7)
📒 Files selected for processing (7)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (6)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
Tip 📝 Customizable high-level summaries are now available in beta!You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.
Example instruction:
Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later. Comment |
gfyrag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concurrently created indexes must be isolated in a single file.
When processing multi statement queries, PG always create a transaction.
And concurrent indexes are not compatible with transactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (5)
internal/storage/bucket/migrations/41-optimize-query-indexes/notes.yamlis excluded by!**/*.yamlinternal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/notes.yamlis excluded by!**/*.yamlinternal/storage/bucket/migrations/43-optimize-accounts-metadata-index/notes.yamlis excluded by!**/*.yamlinternal/storage/bucket/migrations/44-optimize-transactions-metadata-index/notes.yamlis excluded by!**/*.yamlinternal/storage/bucket/migrations/45-drop-old-indexes/notes.yamlis excluded by!**/*.yaml
📒 Files selected for processing (4)
internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql(1 hunks)internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql(1 hunks)internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql(1 hunks)internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql
- internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql
- internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Tests
- GitHub Check: Dirty
- GitHub Check: BenchmarkCompare
🔇 Additional comments (1)
internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql (1)
1-9: Documentation and schema setup look good.The search_path configuration is correct, and the comments clearly explain the optimization's purpose and which queries benefit from it.
internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql
Outdated
Show resolved
Hide resolved
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1119 +/- ##
==========================================
+ Coverage 82.06% 82.17% +0.10%
==========================================
Files 190 190
Lines 9319 9319
==========================================
+ Hits 7648 7658 +10
+ Misses 1219 1211 -8
+ Partials 452 450 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/performance/pkg/write/charts/src/graphs.ts (1)
59-60: LGTM! Consider extracting dimensions to constants.The explicit dimension settings ensure consistent chart output across all performance graphs, which is appropriate for benchmark comparisons.
To reduce duplication, consider extracting the dimensions:
+const CHART_WIDTH = 1200; +const CHART_HEIGHT = 800; + export const exportTPSGraph = async (configuration: {output: string}, result: BenchmarkResult) => {Then use
chart.setWidth(CHART_WIDTH)andchart.setHeight(CHART_HEIGHT)in all three functions.Also applies to: 113-114, 199-200
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
.github/workflows/benchmark-comparison.ymlis excluded by!**/*.yml.github/workflows/benchmark.ymlis excluded by!**/*.yml
📒 Files selected for processing (1)
test/performance/pkg/write/charts/src/graphs.ts(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Tests
- GitHub Check: Dirty
- GitHub Check: BenchmarkCompare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (3)
.github/workflows/benchmark-comparison.ymlis excluded by!**/*.ymlinternal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/notes.yamlis excluded by!**/*.yamlinternal/storage/bucket/migrations/46-drop-old-indexes/notes.yamlis excluded by!**/*.yaml
📒 Files selected for processing (2)
internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/up.sql(1 hunks)internal/storage/bucket/migrations/46-drop-old-indexes/up.sql(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: gfyrag
Repo: formancehq/ledger PR: 1017
File: internal/storage/bucket/migrations/27-fix-invalid-pcv/up.sql:30-31
Timestamp: 2025-07-19T18:35:34.260Z
Learning: In Formance ledger migrations, SHARE ROW EXCLUSIVE locks on migration-specific temporary tables like moves_view are acceptable since these tables are dedicated for the migration. ACCESS SHARE locks on transactions table don't block writes, only schema changes.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: BenchmarkCompare
- GitHub Check: Tests
- GitHub Check: Dirty
🔇 Additional comments (1)
internal/storage/bucket/migrations/46-drop-old-indexes/up.sql (1)
1-19: LGTM — Drop strategy is sound.The use of
IF EXISTSon all DROP statements ensures idempotency. Comments correctly trace which migrations introduced the replacement indexes. Since new indexes are created in prior migrations (41–45), dropping these legacy indexes is safe and incurs no query fallback penalty.One minor note: ensure that all six old indexes are confirmed no longer referenced in any views, stored procedures, or application code outside this migration suite. Based on the PR objectives, this appears to be verified, but it's worth confirming during testing.
internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/up.sql
Outdated
Show resolved
Hide resolved
f487474 to
46aa9d0
Compare
Add migration 41 to optimize database indexes for improved performance: ## Changes ### Moves Table - **DROP**: moves_post_commit_volumes, moves_effective_post_commit_volumes (used accounts_seq instead of account_address) - **ADD**: idx_moves_pit_insertion for PIT queries with insertion_date - **ADD**: idx_moves_pit_effective for PIT queries with effective_date - **ADD**: idx_moves_account_balance for balance lookups ### Accounts Metadata - **DROP**: accounts_metadata_revisions (used accounts_seq) - **ADD**: idx_accounts_metadata_pit for historical metadata queries ### Transactions Metadata - **DROP**: transactions_metadata_revisions (used transactions_seq) - **ADD**: idx_transactions_metadata_pit for historical metadata queries ### Accounts Volumes - **DROP**: accounts_volumes_idx (redundant with PRIMARY KEY) ## Performance Impact - Major improvement for Point-in-Time (PIT) queries - Major improvement for balance lookups by account - Major improvement for historical metadata queries - Reduced storage by removing redundant index ## Affected Query Files - resource_aggregated_balances.go - resource_accounts.go - resource_transactions.go
… degradation Reorder migration operations to: 1. Create all new optimized indexes first (with CONCURRENTLY) 2. Drop old suboptimal indexes only after new ones are ready This ensures continuous query performance during migration execution.
Remove ledger and date columns from indexes as requested: - Moves indexes: removed 'ledger' column prefix - Accounts metadata index: removed 'ledger' and 'date' columns - Transactions metadata index: already optimized (ledger/date not included) This reduces index size while maintaining query performance.
Split the single migration 41 into 5 separate migrations: - Migration 41: Create idx_moves_pit_insertion index - Migration 42: Create idx_moves_pit_effective index - Migration 43: Create idx_accounts_metadata_pit index - Migration 44: Create idx_transactions_metadata_pit index - Migration 45: Drop all old indexes This ensures each concurrent index creation is isolated in its own migration, and all index drops are consolidated into a final migration. Also removed the optional idx_moves_account_balance index on Moves table.
Add comprehensive notes.yaml documentation for each migration: - Migration 41: idx_moves_pit_insertion index creation - Migration 42: idx_moves_pit_effective index creation - Migration 43: idx_accounts_metadata_pit index creation - Migration 44: idx_transactions_metadata_pit index creation - Migration 45: Drop all old indexes Each file documents the purpose, changes, affected queries, performance impact, and safety considerations.
CONCURRENTLY cannot be used within the migration framework as migrations run within a transactional context (advisory locks). Removed CONCURRENTLY from all index creation statements to ensure migrations can execute successfully. The migrations remain isolated (one index per migration) which maintains good organization and allows for easier rollback if needed. Index creation will lock tables during execution but ensures compatibility with the migration system. Note: Migration 41-44 create new indexes, migration 45 drops old ones. This approach minimizes risk by creating all new indexes before dropping old ones.
Add idx_moves_update_effective_volumes index to prevent sequential scans on every INSERT when MOVES_HISTORY feature is enabled (ON by default). The update_effective_volumes trigger runs after each move insertion and updates all moves with effective_date > new.effective_date. Without this index, this operation causes significant performance degradation at scale. Changes: - New migration 45: Create idx_moves_update_effective_volumes - Migration 45 renamed to 46: Updated to drop moves_range_dates
761f072 to
ab5ba2f
Compare
🎯 Objective
Optimize database indexes to drastically improve query performance for Point-in-Time (PIT) queries, balance lookups, and historical metadata queries.
🔍 Problem Analysis
The existing indexes had a mismatch between their structure and actual query patterns:
Issues Identified
accounts_seqbut queries filtered onaccount_addressaccounts_seq/transactions_seqbut queries usedaccounts_address/transactions_idaccounts_volumes_idxduplicated the PRIMARY KEY functionalityThis mismatch caused:
✨ Solution
Migration 41: Optimize Query Indexes
This migration introduces a zero-downtime index optimization strategy:
CONCURRENTLY)This ensures continuous query performance during migration execution.
📊 Changes
Moves Table (3 new indexes, 2 dropped)
Dropped
moves_post_commit_volumes- usedaccounts_seqmoves_effective_post_commit_volumes- usedaccounts_seqAdded
idx_moves_pit_insertionon(account_address, asset, insertion_date DESC, seq DESC)idx_moves_pit_effectiveon(account_address, asset, effective_date DESC, seq DESC)idx_moves_account_balanceon(account_address, effective_date DESC, seq DESC)(asset, post_commit_effective_volumes)Accounts Metadata (1 new, 1 dropped)
Dropped
accounts_metadata_revisions- usedaccounts_seqAdded
idx_accounts_metadata_piton(accounts_address, revision DESC)(metadata, date)Transactions Metadata (1 new, 1 dropped)
Dropped
transactions_metadata_revisions- usedtransactions_seqAdded
idx_transactions_metadata_piton(transactions_id, revision DESC)(metadata, date)Accounts Volumes (cleanup)
Dropped
accounts_volumes_idx- redundant with PRIMARY KEY(ledger, accounts_address, asset)🎯 Affected Queries
resource_aggregated_balances.go: PIT queries withDISTINCT ON (accounts_address, asset)resource_accounts.go: Balance filtering witheffective_dateresource_accounts.go: Account expansion volumes queriesresource_accounts.go: Historical metadata lookupsresource_transactions.go: Historical transaction metadata lookups🚀 Performance Impact
Expected Improvements
Before vs After
🔒 Migration Safety
✅ Zero downtime deployment
CONCURRENTLYflag to avoid table locks✅ Rollback safe
📁 Files Changed
internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql(66 lines)internal/storage/bucket/migrations/41-optimize-query-indexes/notes.yaml(58 lines)✅ Testing Recommendations
EXPLAIN ANALYZE SELECT ...📝 Notes
notes.yamlidx_<table>_<purpose>