payments: SQL backend implementation series#10604
Conversation
Summary of Changes

Hello @ziggie1984, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request consolidates and integrates a complete SQL backend for LND's payments database. This significant architectural change enables the use of relational databases like SQLite and Postgres for storing payment states, moving beyond the previous bbolt key-value store. The work encompasses a new SQL schema, a full suite of CRUD operations, context threading throughout payment lifecycle functions, and a refactored, database-agnostic testing framework. This PR represents the culmination of several incrementally reviewed and merged sub-PRs, bringing a more robust and flexible payment storage solution to the main branch.
Code Review
The pull request introduces a comprehensive SQL backend for the payments database in lnd, supporting both SQLite and Postgres. The implementation is well-structured, follows the established patterns in the codebase, and includes thorough documentation for the new SQL-specific functions. The transition to context-aware database operations is a significant improvement for resource management. The implementation generally adheres to the LND Style Guide, particularly regarding function documentation and code segmentation. However, there are a few areas where performance and consistency can be improved, particularly regarding timestamp handling, non-deterministic result ordering in some queries, and the use of context.TODO() in the routing layer.
```go
Value: lnwire.MilliSatoshi(amountMsat),

// The creation time is stored in the database as UTC but here
// we convert it to local time.
CreationTime: createdAt.Local(),
```
The use of .Local() here converts the timestamp to the node's local timezone. Since lnd generally uses UTC internally for all database operations and logic (e.g., expiry calculations), this could lead to inconsistent behavior if the node's local timezone is not UTC. It's better to keep the time in UTC as it is stored in the database.
Suggested change:

```diff
- CreationTime: createdAt.Local(),
+ CreationTime: createdAt,
```
Thanks for the review! However, I think this is intentional and the suggested change would actually break tests.
The .Local() call is needed for consistency with the KV backend. In codec.go, deserializeTime returns time.Unix(0, int64(unixNano)), and Go's time.Unix sets the Location to time.Local by default. The SQL converter's .Local() makes the two backends return time.Time values with the same Location field.
More concretely, in payment_test.go the test constructs CreationTime as:
```go
CreationTime: time.Unix(time.Now().Unix(), 0),
```

which also returns local time. The test then compares via reflect.DeepEqual(payment.Info, c) (line 247), which checks the Location field of time.Time. If we removed .Local(), the SQL backend would return UTC while the test expects local time, causing this DeepEqual check to fail.
As for the actual usage of CreationTime in the codebase — it's only ever consumed via .Unix() and .UnixNano() in router_backend.go, both of which are timezone-agnostic. So this is purely about test consistency.
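For reference, the Location mismatch described in this reply can be reproduced in a few lines of standalone Go (the timestamp value is arbitrary):

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// sameInstantDifferentLocation demonstrates why the SQL converter calls
// .Local(): two time.Time values can describe the same instant yet fail
// reflect.DeepEqual because their Location fields differ.
func sameInstantDifferentLocation() (equalInstant, deepEqual bool) {
	// time.Unix sets the Location to time.Local, matching what the KV
	// backend's deserializeTime returns.
	kvStyle := time.Unix(1700000000, 0)

	// A UTC value, as a database driver would typically return it.
	sqlStyle := kvStyle.UTC()

	return kvStyle.Equal(sqlStyle), reflect.DeepEqual(kvStyle, sqlStyle)
}

func main() {
	equalInstant, deepEqual := sameInstantDifferentLocation()
	// Same instant in time, but DeepEqual fails on the Location field.
	fmt.Println(equalInstant, deepEqual)
}
```

This holds even when the node's local timezone happens to be UTC, because DeepEqual compares the internal location pointers rather than the instants.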
payments/db/sql_store.go
Outdated
```go
err = db.SettleAttempt(ctx, sqlc.SettleAttemptParams{
	AttemptIndex:   int64(attemptID),
	ResolutionTime: time.Now(),
```
Using time.Now() here ignores the SettleTime provided in the settleInfo parameter. This timestamp is often generated using a mockable clock in tests or represents the actual time the event occurred in the payment lifecycle. Hardcoding time.Now() makes the database record inconsistent with the event time and can break time-sensitive tests.
Suggested change:

```diff
- ResolutionTime: time.Now(),
+ ResolutionTime: settleInfo.SettleTime,
```
Fixed — ResolutionTime now uses settleInfo.SettleTime instead of time.Now(). A test assertion was also added to assertPaymentInfo to verify the settle time is correctly stored and retrieved, with explicit timestamps passed at the call sites.
payments/db/sql_store.go
Outdated
```go
err = db.FailAttempt(ctx, sqlc.FailAttemptParams{
	AttemptIndex:   int64(attemptID),
	ResolutionTime: time.Now(),
```
Using time.Now() here ignores the FailTime provided in the failInfo parameter. Similar to the settlement case, this timestamp should reflect the actual time the failure occurred as determined by the caller, ensuring consistency in the database and supporting deterministic testing.
Suggested change:

```diff
- ResolutionTime: time.Now(),
+ ResolutionTime: failInfo.FailTime,
```
Fixed — ResolutionTime now uses failInfo.FailTime instead of time.Now(). Same as the settle case, a test assertion was added to verify the fail time is correctly stored and retrieved.
```go
mpPayments = make([]*MPPayment, 0, len(processedPayments))
for _, payment := range processedPayments {
	mpPayments = append(mpPayments, payment)
}
```
The returned slice of payments is constructed by iterating over a map, which results in a non-deterministic order. The KV backend implementation returns payments ordered by their hash (due to bbolt's ForEach order). To maintain consistency and provide a predictable API, the results should be sorted, for example by their sequence number.
Acknowledged — the map iteration order is non-deterministic. A sort.Slice by SequenceNum has been added after the map-to-slice conversion to produce a deterministic, insertion-ordered result.
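The fix described above can be sketched in standalone Go; MPPayment is pared down to the one field that matters for the sort:

```go
package main

import (
	"fmt"
	"sort"
)

// MPPayment is a minimal stand-in for the payments package type; only the
// SequenceNum field is relevant to this sketch.
type MPPayment struct {
	SequenceNum uint64
}

// sortBySequence applies the fix: after collecting payments from a map
// (whose iteration order Go deliberately randomises), sort by sequence
// number to restore a deterministic, insertion-ordered result.
func sortBySequence(payments []*MPPayment) []*MPPayment {
	sort.Slice(payments, func(i, j int) bool {
		return payments[i].SequenceNum < payments[j].SequenceNum
	})
	return payments
}

func main() {
	ps := []*MPPayment{{SequenceNum: 3}, {SequenceNum: 1}, {SequenceNum: 2}}
	for _, p := range sortBySequence(ps) {
		fmt.Println(p.SequenceNum)
	}
}
```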
```go
if mpPayment.Status != StatusSucceeded &&
	!query.IncludeIncomplete {

	return nil
}
```
When IncludeIncomplete is false, the code still fetches the complete payment data (including all HTLCs and hops) for every payment in the range, only to skip it in Go code if it hasn't succeeded. This is inefficient for large databases. This filtering should be performed in the SQL query itself, for example by checking for the existence of a settled HTLC resolution.
Thanks for the suggestion! However, filtering at the SQL level is not straightforward here by design. The payment status is not stored as a column in the payments table — it is computed in Go by examining the HTLC attempt resolutions. There is no status field the SQL query can filter on directly.
To push this filtering into SQL would require either adding a denormalised status column (and keeping it in sync on every state transition), or a subquery joining across htlc_attempts and htlc_attempt_resolutions to infer status. Both options are significant complexity for a performance optimisation. We'll keep the current approach and defer any optimisation to a follow-up if it becomes a bottleneck.
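For reference, the subquery option mentioned above might look roughly like the following; every table and column name here is an assumption for illustration, not the actual sqlc schema:

```sql
-- Hypothetical sketch: select only payments that have at least one
-- settled HTLC attempt resolution. Table/column names are illustrative.
SELECT p.*
FROM payments p
WHERE EXISTS (
    SELECT 1
    FROM payment_htlc_attempts a
    JOIN payment_htlc_attempt_resolutions r
        ON r.attempt_index = a.attempt_index
    WHERE a.payment_id = p.id
      AND r.resolution_type = 1 -- assumed enum value for "settled"
);
```

Even in sketch form, the join across two tables per payment shows why this was deemed significant complexity for an optimisation with no demonstrated bottleneck.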
```go
func (r *ChannelRouter) PreparePayment(payment *LightningPayment) (
	PaymentSession, shards.ShardTracker, error) {

	ctx := context.TODO()
```
The use of context.TODO() here and in other methods within ChannelRouter prevents proper cancellation and resource management. Since the payment lifecycle functions have been updated to accept a context, it's an improvement opportunity to thread the context from the caller (e.g., from the RPC layer) down to these methods.
This is out of scope for this PR. PreparePayment does not accept a context parameter, so threading one through would require changing the interface and all callers — a separate refactor. We already thread context through all the payment DB methods introduced in this PR; propagating it further up the call stack can be addressed in a follow-up.
```go
Scid: strconv.FormatUint(
	hop.ChannelID, 10,
),
```
Storing the Short Channel ID (SCID) as a TEXT field and performing string conversions (strconv.FormatUint) on every write is less efficient than using a BIGINT (uint64). Since SCIDs are fundamentally 64-bit integers, using the appropriate numeric type in the schema would improve performance and reduce storage requirements.
The TEXT type for SCIDs is intentional and consistent with the existing invoice table design, which also stores chan_id as TEXT (see 000001_invoices.up.sql). Changing to BIGINT here would introduce a third convention alongside the invoice (TEXT) and graph (BLOB) tables, without providing a consistency benefit. We'll keep it aligned with the invoice table.
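For illustration, the TEXT round trip this convention implies looks like the following in standalone Go; the SCID value is an arbitrary example:

```go
package main

import (
	"fmt"
	"strconv"
)

// scidToText encodes a 64-bit short channel ID for storage in a TEXT
// column, matching the base-10 convention used by the invoice table.
func scidToText(scid uint64) string {
	return strconv.FormatUint(scid, 10)
}

// scidFromText decodes the TEXT column back into a uint64.
func scidFromText(s string) (uint64, error) {
	return strconv.ParseUint(s, 10, 64)
}

func main() {
	const scid uint64 = 881152438961045505 // arbitrary example value
	text := scidToText(scid)
	back, err := scidFromText(text)
	fmt.Println(text, back == scid, err)
}
```

The round trip is lossless for the full uint64 range, so the trade-off is purely the per-write conversion cost and the larger on-disk representation noted by the reviewer.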
Included PRs

This PR is a consolidation of the following series of PRs that were merged individually:

- #9147 - [Part 1|3] Introduce SQL Payment schema into LND: This implements the SQL backend schema for the payments subsystem of LND. The following parts will focus on the backend functions, their tests, and context threading.
- #10287 - [Part 2|*] Implement First Part for SQL Backend functions: This PR implements the first part of the SQL backend functions. It does not yet add testing for these functions, which will be added in follow-up PRs once we have the ability to add payments and attempts to the database, which we can then fetch.
- #10291 - [Part 3|*] Implement Second Part for SQL Backend functions: This PR implements most of the Insert and Delete methods we need for the unit tests in the following PR.
- #10368 - [Part 3b|*] Implement Third Part for SQL Backend functions: Implements the third and final part of the payments SQL backend functions. Follows after #10291.
- #10292 - [Part 4|*] Add unit tests for the previously introduced SQL Backend methods: This PR adds the DB-related unit tests for the backend functions introduced in #10287, #10291, and #10368, and finalizes the implementation of the SQL backend. With this PR, the unit tests and the itests pass for the SQL payments backend.
- #10307 - [Part 5|*] Thread Contexts through payment methods, Part 1: Builds on top of the latest payment SQL changes and threads the context through payment functions. Part 1 threads the context up to the router level. We could thread the context further, up to the RPC level, but we need to be careful about which functions we allow to be cancelled from the RPC level; this is tackled in Part 2.
- #10308 - [Part 6|*] Thread Contexts through payment methods, Part 2: Part 2 of the context-threading refactor.
- #10373 - [Part 7|*] Addressing TODOs left from the SQL payments implementation: This PR cleans up some TODOs from the payments SQL series before tackling the dispatcher and the migration.
PR #9147 — [Part 1/3] Introduce SQL Payment schema into LND
✅ All 4 commits accounted for

PR #10287 — [Part 2] Implement First Part for SQL Backend functions
✅ All 6 commits accounted for

PR #10291 — [Part 3] Implement Second Part for SQL Backend functions
✅ All 13 commits accounted for

PR #10368 — [Part 3b] Implement Third Part for SQL Backend functions
✅ All 12 commits accounted for

PR #10292 — [Part 4] Add unit tests for SQL Backend methods
✅ All 13 commits accounted for

PR #10307 — [Part 5] Thread Contexts through payment methods (Part 1)
✅ All 11 commits accounted for

PR #10308 — [Part 6] Thread Contexts through payment methods (Part 2)
✅ All 8 commits accounted for

PR #10373 — [Part 7] Addressing TODOs from SQL payments implementation
✅ All 7 commits accounted for

Consolidation-branch-only commits
4 commits have no counterpart in any constituent PR (original fixes applied)
We prepare the code for the SQL payment backend. However, no payment DB interface method for the SQL backend is implemented yet; this will be done in the following commits. They currently use the embedded KVStore to satisfy the build environment.
This does not include duplicate payments yet. They will be added when the migration code is introduced for payments.
We allow the migration of the payment schema to be applied when test_native_sql is active to make sure the tables are properly constructed.
Previously a one(intent)-to-many(payment) relationship, this is now changed to a one-to-one relationship because a payment request can only have one payment related to it. Looking into the future with BOLT12 offers, the fetched invoice from the offer could be stored here as well and the relationship would still hold.
A context is added to failPaymentAndAttempt and its dependent function calls.
We rework TestDeleteNonInFlight and separate out all the logic for the duplicate payment test case. The deletion of duplicate payments is now tested in isolation, only for the KV backend.
Previously, we had DB and application logic mixed at the DB level. We now move the config option KeepFailedPaymentAttempts out of the DB level and up to the ChannelRouter level.
We add a test which tests the retrieval of first hop data like the first hop amount or the custom records on the route level.
We add a couple of additional tests to increase the unit test coverage of both the SQL store and the KV store. We only create DB-agnostic unit tests so both backends are tested effectively.
The 000009 schema version slot is now taken by 000009_graph_v2_columns, which was merged ahead of the payments schema. Bump the payments schema file number to 000010 to avoid the collision. Moreover, update the migration_dev.go file to reflect this change and update the migration order.
🟡 PR Severity: MEDIUM

- 🔴 Critical (4 files) — overridden by label
- 🟠 High (11 files) — overridden by label
- 🟡 Medium (14 files)
- 🟢 Low (1 file)
- ⚪ Excluded from counting (tests & generated): test files and auto-generated files

Analysis

Override in effect: a natural classification would be CRITICAL, and both severity-bump thresholds are exceeded (noted for informational purposes).

This PR consolidates the full SQL-backed payments database implementation, bringing together a series of smaller, individually reviewed PRs. To override, add a label.
…timestamps The SQL backend introduced in this PR was ignoring the SettleTime and FailTime fields provided in HTLCSettleInfo and HTLCFailInfo, instead always recording time.Now() as the resolution timestamp. The KV backend correctly serializes and deserializes these fields. The timestamps are set by the caller using a mockable clock (p.router.cfg.Clock.Now() in payment_lifecycle.go), so ignoring them means the stored timestamp reflects when the DB write happened rather than when the event occurred, breaking deterministic testing. This commit also extends the test assertions in assertPaymentInfo to verify that SettleTime and FailTime are correctly stored and retrieved by the SQL backend, and updates the relevant call sites to pass explicit timestamps so regressions are caught.
The SQL implementation collects payments into a map before converting to a slice, resulting in non-deterministic iteration order due to Go's intentional map randomisation. Sort the result by SequenceNum to produce a deterministic, insertion-ordered output. Note that the current sole caller (resumePayments in router.go) processes each payment independently, so this ordering does not affect any existing behaviour.
- Remove duplicate compile-time interface assertion for SQLStore. - Fix the down migration to drop payment_intents before payments to respect the foreign key dependency order. This was not a bug in the first place because we have the CASCADE when deleting payments.
```diff
 err = db.SettleAttempt(ctx, sqlc.SettleAttemptParams{
 	AttemptIndex:   int64(attemptID),
-	ResolutionTime: time.Now(),
+	ResolutionTime: settleInfo.SettleTime.UTC(),
```
Q: what happens if this field is not set? @gemini-code-assist
That's a good question! In Go, time.Time is a struct, so it's always initialized to a value, even if it's the zero value (January 1, year 1, 00:00:00 UTC). The .UTC() method will convert this value to UTC.
Furthermore, the resolution_time column in the payment_htlc_attempt_resolutions table is defined as TIMESTAMP NOT NULL in the SQL schema. This means the database itself enforces that a value must always be provided for this field.
In the payment_lifecycle (where SettleAttempt is called), the SettleTime field of settleInfo is explicitly set using p.router.cfg.Clock.Now(). This function is designed to return the current time, ensuring that SettleTime is always a valid, non-zero timestamp in practice. Therefore, this field should always have a meaningful value when an attempt is settled.
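The zero-value behaviour described in this answer can be verified in a few lines of standalone Go:

```go
package main

import (
	"fmt"
	"time"
)

// zeroTimeBehaviour demonstrates the point above: time.Time is a struct,
// so an unset field is the zero value (January 1, year 1, 00:00:00 UTC)
// rather than nil, and calling .UTC() on it is perfectly legal.
func zeroTimeBehaviour() (isZero bool, year int) {
	var unset time.Time
	return unset.IsZero(), unset.UTC().Year()
}

func main() {
	isZero, year := zeroTimeBehaviour()
	fmt.Println(isZero, year)
}
```

Code that needs to guard against an unset timestamp would therefore check t.IsZero() rather than compare against nil.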
Summary
This PR consolidates the payment SQL backend project, which was developed as a series of smaller, independently reviewed PRs that have now all been merged into this branch.
The work introduces a full SQL-backed implementation of the payments database, allowing lnd to store payment state in a relational database (SQLite/Postgres) in addition to the existing bbolt KV store.
What's included
- SQL implementations of the payment DB interface methods: FetchPayment, QueryPayments, InitPayment, RegisterAttempt, SettleAttempt, FailAttempt, Fail, DeletePayment, DeletePayments, DeleteFailedAttempts, FetchInFlightPayments
- Payment lifecycle functions now accept a context.Context
- Additional test coverage for FetchInflightPayments, first hop data, computePaymentStatusFromResolutions, and more

This series was reviewed and merged incrementally; this PR brings the full implementation onto master.