Skip to content

Conversation

@flemzord
Copy link
Member

@flemzord flemzord commented Nov 10, 2025

🎯 Objective

Optimize database indexes to drastically improve query performance for Point-in-Time (PIT) queries, balance lookups, and historical metadata queries.

🔍 Problem Analysis

The existing indexes had a mismatch between their structure and actual query patterns:

Issues Identified

  1. Moves table: Indexes used accounts_seq but queries filtered on account_address
  2. Metadata tables: Indexes used accounts_seq/transactions_seq but queries used accounts_address/transactions_id
  3. Redundant index: accounts_volumes_idx duplicated the PRIMARY KEY functionality

This mismatch caused:

  • Full table scans on large datasets
  • Poor performance on PIT queries
  • Slow balance lookups
  • Inefficient metadata history queries

✨ Solution

Migration 41: Optimize Query Indexes

This migration introduces a zero-downtime index optimization strategy:

  1. CREATE new optimized indexes FIRST (with CONCURRENTLY)
  2. DROP old suboptimal indexes AFTER new ones are ready

This ensures continuous query performance during migration execution.

📊 Changes

Moves Table (3 new indexes, 2 dropped)

Dropped

  • moves_post_commit_volumes - used accounts_seq
  • moves_effective_post_commit_volumes - used accounts_seq

Added

  • idx_moves_pit_insertion on (account_address, asset, insertion_date DESC, seq DESC)
    • For PIT queries with insertion_date
  • idx_moves_pit_effective on (account_address, asset, effective_date DESC, seq DESC)
    • For PIT queries with effective_date
  • idx_moves_account_balance on (account_address, effective_date DESC, seq DESC)
    • INCLUDE (asset, post_commit_effective_volumes)
    • For balance filtering queries

Accounts Metadata (1 new, 1 dropped)

Dropped

  • accounts_metadata_revisions - used accounts_seq

Added

  • idx_accounts_metadata_pit on (accounts_address, revision DESC)
    • INCLUDE (metadata, date)
    • For historical metadata lookups

Transactions Metadata (1 new, 1 dropped)

Dropped

  • transactions_metadata_revisions - used transactions_seq

Added

  • idx_transactions_metadata_pit on (transactions_id, revision DESC)
    • INCLUDE (metadata, date)
    • For historical transaction metadata lookups

Accounts Volumes (cleanup)

Dropped

  • accounts_volumes_idx - redundant with PRIMARY KEY (ledger, accounts_address, asset)

🎯 Affected Queries

  • resource_aggregated_balances.go: PIT queries with DISTINCT ON (accounts_address, asset)
  • resource_accounts.go: Balance filtering with effective_date
  • resource_accounts.go: Account expansion volumes queries
  • resource_accounts.go: Historical metadata lookups
  • resource_transactions.go: Historical transaction metadata lookups

🚀 Performance Impact

Expected Improvements

  • 🔥 MAJOR improvement for Point-in-Time (PIT) queries
  • 🔥 MAJOR improvement for balance lookups by account
  • 🔥 MAJOR improvement for historical metadata queries
  • 💾 Minor reduction in storage (redundant index removed)

Before vs After

Query Type Before After
PIT balance lookup Full table scan Index-only scan
Historical metadata Seq scan + filter Index scan
Balance by account Table scan Covered index scan

🔒 Migration Safety

Zero downtime deployment

  • Uses CONCURRENTLY flag to avoid table locks
  • Creates new indexes before dropping old ones
  • No data modification - only index structure changes

Rollback safe

  • Old indexes remain until new ones are fully built
  • Can be safely interrupted and restarted

📁 Files Changed

  • internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql (66 lines)
  • internal/storage/bucket/migrations/41-optimize-query-indexes/notes.yaml (58 lines)

✅ Testing Recommendations

  1. Run migration on staging environment first
  2. Monitor index creation progress with:
    SELECT * FROM pg_stat_progress_create_index;
  3. Verify query plans after migration:
    EXPLAIN ANALYZE SELECT ...
  4. Check index sizes:
    SELECT schemaname, tablename, indexname, pg_size_pretty(pg_relation_size(indexrelid))
    FROM pg_indexes i
    JOIN pg_class c ON c.relname = i.indexname
    WHERE schemaname = 'your_schema';

📝 Notes

  • Migration includes comprehensive documentation in notes.yaml
  • Index naming follows convention: idx_<table>_<purpose>
  • All affected query patterns have been analyzed and documented

@flemzord flemzord requested a review from a team as a code owner November 10, 2025 15:14
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 10, 2025

Walkthrough

This PR introduces five database index optimization migrations targeting the moves, accounts_metadata, and transactions_metadata tables for Point-in-Time queries, followed by a migration that drops legacy indexes. It also refactors a test to use local variable scoping for cleanup.

Changes

Cohort / File(s) Summary
Moves table PIT optimizations
internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql
internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql
internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/up.sql
Creates three new indexes: idx_moves_pit_insertion (insertion_date desc, seq desc), idx_moves_pit_effective (effective_date desc, seq desc), and an index for update_effective_volumes trigger optimization, all covering (accounts_address, asset) columns with CONCURRENTLY support.
Metadata table PIT optimizations
internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql
internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql
Creates idx_accounts_metadata_pit on (accounts_address, revision DESC) with INCLUDE (metadata, date) and idx_transactions_metadata_pit on (transactions_id, revision desc) with INCLUDE (metadata, date) for historical query performance.
Legacy index cleanup
internal/storage/bucket/migrations/46-drop-old-indexes/up.sql
Drops six obsolete indexes: moves_post_commit_volumes, moves_effective_post_commit_volumes, moves_range_dates, accounts_metadata_revisions, transactions_metadata_revisions, and accounts_volumes_idx.
Test resource cleanup
test/e2e/app_lifecycle_test.go
Refactors service restart test to use local variable srv capturing Wait(ctx) result with DeferCleanup for proper resource disposal.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

  • The migrations follow consistent patterns with templated concurrency and clear documentation
  • Index definitions are straightforward CREATE INDEX statements with no complex logic
  • Migration 46 is a simple DROP IF EXISTS cleanup
  • Test change is a minimal refactoring for proper resource scoping

Poem

🐰 Old indexes fade away, their work now done,
New PIT-optimized paths shine bright and lean,
Three moves, two metadata, one cleanup run—
The schema's tuned, more efficient than before!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main objective of the PR: optimizing database indexes for better query performance, which aligns with all the changes made.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, explaining the problems, solutions, and detailed changes across multiple database migrations.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/optimize-indexes

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 761f072 and ab5ba2f.

⛔ Files ignored due to path filters (7)
  • .github/workflows/benchmark-comparison.yml is excluded by !**/*.yml
  • internal/storage/bucket/migrations/41-optimize-query-indexes/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/46-drop-old-indexes/notes.yaml is excluded by !**/*.yaml
📒 Files selected for processing (7)
  • internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql (1 hunks)
  • internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql (1 hunks)
  • internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql (1 hunks)
  • internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql (1 hunks)
  • internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/up.sql (1 hunks)
  • internal/storage/bucket/migrations/46-drop-old-indexes/up.sql (1 hunks)
  • test/e2e/app_lifecycle_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql
🚧 Files skipped from review as they are similar to previous changes (6)
  • internal/storage/bucket/migrations/46-drop-old-indexes/up.sql
  • test/e2e/app_lifecycle_test.go
  • internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql
  • internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql
  • internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/up.sql
  • internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Tests
  • GitHub Check: Dirty
  • GitHub Check: BenchmarkCompare

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@gfyrag gfyrag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concurrently created indexes must be isolated in a single file.
When processing multi statement queries, PG always create a transaction.
And concurrent indexes are not compatible with transactions.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between fcf587d and 4e56c84.

⛔ Files ignored due to path filters (5)
  • internal/storage/bucket/migrations/41-optimize-query-indexes/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/45-drop-old-indexes/notes.yaml is excluded by !**/*.yaml
📒 Files selected for processing (4)
  • internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql (1 hunks)
  • internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql (1 hunks)
  • internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql (1 hunks)
  • internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/storage/bucket/migrations/43-optimize-accounts-metadata-index/up.sql
  • internal/storage/bucket/migrations/42-optimize-query-indexes-pit-effective/up.sql
  • internal/storage/bucket/migrations/41-optimize-query-indexes/up.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Tests
  • GitHub Check: Dirty
  • GitHub Check: BenchmarkCompare
🔇 Additional comments (1)
internal/storage/bucket/migrations/44-optimize-transactions-metadata-index/up.sql (1)

1-9: Documentation and schema setup look good.

The search_path configuration is correct, and the comments clearly explain the optimization's purpose and which queries benefit from it.

@codecov
Copy link

codecov bot commented Nov 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.17%. Comparing base (0870634) to head (ab5ba2f).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1119      +/-   ##
==========================================
+ Coverage   82.06%   82.17%   +0.10%     
==========================================
  Files         190      190              
  Lines        9319     9319              
==========================================
+ Hits         7648     7658      +10     
+ Misses       1219     1211       -8     
+ Partials      452      450       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/performance/pkg/write/charts/src/graphs.ts (1)

59-60: LGTM! Consider extracting dimensions to constants.

The explicit dimension settings ensure consistent chart output across all performance graphs, which is appropriate for benchmark comparisons.

To reduce duplication, consider extracting the dimensions:

+const CHART_WIDTH = 1200;
+const CHART_HEIGHT = 800;
+
 export const exportTPSGraph = async (configuration: {output: string}, result: BenchmarkResult) => {

Then use chart.setWidth(CHART_WIDTH) and chart.setHeight(CHART_HEIGHT) in all three functions.

Also applies to: 113-114, 199-200

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ae84687 and 9b0ceb9.

⛔ Files ignored due to path filters (2)
  • .github/workflows/benchmark-comparison.yml is excluded by !**/*.yml
  • .github/workflows/benchmark.yml is excluded by !**/*.yml
📒 Files selected for processing (1)
  • test/performance/pkg/write/charts/src/graphs.ts (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Tests
  • GitHub Check: Dirty
  • GitHub Check: BenchmarkCompare

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9b0ceb9 and 3f26312.

⛔ Files ignored due to path filters (3)
  • .github/workflows/benchmark-comparison.yml is excluded by !**/*.yml
  • internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/notes.yaml is excluded by !**/*.yaml
  • internal/storage/bucket/migrations/46-drop-old-indexes/notes.yaml is excluded by !**/*.yaml
📒 Files selected for processing (2)
  • internal/storage/bucket/migrations/45-add-moves-update-effective-volumes-index/up.sql (1 hunks)
  • internal/storage/bucket/migrations/46-drop-old-indexes/up.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: gfyrag
Repo: formancehq/ledger PR: 1017
File: internal/storage/bucket/migrations/27-fix-invalid-pcv/up.sql:30-31
Timestamp: 2025-07-19T18:35:34.260Z
Learning: In Formance ledger migrations, SHARE ROW EXCLUSIVE locks on migration-specific temporary tables like moves_view are acceptable since these tables are dedicated for the migration. ACCESS SHARE locks on transactions table don't block writes, only schema changes.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: BenchmarkCompare
  • GitHub Check: Tests
  • GitHub Check: Dirty
🔇 Additional comments (1)
internal/storage/bucket/migrations/46-drop-old-indexes/up.sql (1)

1-19: LGTM — Drop strategy is sound.

The use of IF EXISTS on all DROP statements ensures idempotency. Comments correctly trace which migrations introduced the replacement indexes. Since new indexes are created in prior migrations (41–45), dropping these legacy indexes is safe and incurs no query fallback penalty.

One minor note: ensure that all six old indexes are confirmed no longer referenced in any views, stored procedures, or application code outside this migration suite. Based on the PR objectives, this appears to be verified, but it's worth confirming during testing.

@flemzord flemzord force-pushed the feat/optimize-indexes branch from f487474 to 46aa9d0 Compare November 18, 2025 17:21
gfyrag
gfyrag previously approved these changes Nov 21, 2025
Add migration 41 to optimize database indexes for improved performance:

## Changes

### Moves Table
- **DROP**: moves_post_commit_volumes, moves_effective_post_commit_volumes
  (used accounts_seq instead of account_address)
- **ADD**: idx_moves_pit_insertion for PIT queries with insertion_date
- **ADD**: idx_moves_pit_effective for PIT queries with effective_date
- **ADD**: idx_moves_account_balance for balance lookups

### Accounts Metadata
- **DROP**: accounts_metadata_revisions (used accounts_seq)
- **ADD**: idx_accounts_metadata_pit for historical metadata queries

### Transactions Metadata
- **DROP**: transactions_metadata_revisions (used transactions_seq)
- **ADD**: idx_transactions_metadata_pit for historical metadata queries

### Accounts Volumes
- **DROP**: accounts_volumes_idx (redundant with PRIMARY KEY)

## Performance Impact
- Major improvement for Point-in-Time (PIT) queries
- Major improvement for balance lookups by account
- Major improvement for historical metadata queries
- Reduced storage by removing redundant index

## Affected Query Files
- resource_aggregated_balances.go
- resource_accounts.go
- resource_transactions.go
… degradation

Reorder migration operations to:
1. Create all new optimized indexes first (with CONCURRENTLY)
2. Drop old suboptimal indexes only after new ones are ready

This ensures continuous query performance during migration execution.
Remove ledger and date columns from indexes as requested:
- Moves indexes: removed 'ledger' column prefix
- Accounts metadata index: removed 'ledger' and 'date' columns
- Transactions metadata index: already optimized (ledger/date not included)

This reduces index size while maintaining query performance.
Split the single migration 41 into 5 separate migrations:
- Migration 41: Create idx_moves_pit_insertion index
- Migration 42: Create idx_moves_pit_effective index
- Migration 43: Create idx_accounts_metadata_pit index
- Migration 44: Create idx_transactions_metadata_pit index
- Migration 45: Drop all old indexes

This ensures each concurrent index creation is isolated in its own
migration, and all index drops are consolidated into a final migration.
Also removed the optional idx_moves_account_balance index on Moves table.
Add comprehensive notes.yaml documentation for each migration:
- Migration 41: idx_moves_pit_insertion index creation
- Migration 42: idx_moves_pit_effective index creation
- Migration 43: idx_accounts_metadata_pit index creation
- Migration 44: idx_transactions_metadata_pit index creation
- Migration 45: Drop all old indexes

Each file documents the purpose, changes, affected queries,
performance impact, and safety considerations.
CONCURRENTLY cannot be used within the migration framework as migrations
run within a transactional context (advisory locks). Removed CONCURRENTLY
from all index creation statements to ensure migrations can execute successfully.

The migrations remain isolated (one index per migration) which maintains
good organization and allows for easier rollback if needed. Index creation
will lock tables during execution but ensures compatibility with the
migration system.

Note: Migration 41-44 create new indexes, migration 45 drops old ones.
This approach minimizes risk by creating all new indexes before dropping
old ones.
Add idx_moves_update_effective_volumes index to prevent sequential scans
on every INSERT when MOVES_HISTORY feature is enabled (ON by default).

The update_effective_volumes trigger runs after each move insertion and
updates all moves with effective_date > new.effective_date. Without this
index, this operation causes significant performance degradation at scale.

Changes:
- New migration 45: Create idx_moves_update_effective_volumes
- Migration 45 renamed to 46: Updated to drop moves_range_dates
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants