-
Notifications
You must be signed in to change notification settings - Fork 2.2k
htlcswitch: fix error misclassification for on-chain resolutions #10456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
htlcswitch: fix error misclassification for on-chain resolutions #10456
Conversation
Summary of ChangesHello @calvinrzachman, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical bug within the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly addresses a subtle bug in htlcswitch that led to misclassifying on-chain HTLC resolution errors. The fix of checking len(htlc.Reason) == 0 instead of htlc.Reason == nil is robust and handles the serialization artifact well. The addition of the TestExtractResult unit test is excellent, as it thoroughly covers various scenarios and prevents regressions. I have a couple of minor suggestions to improve comment clarity and fix a typo in the new test code.
htlcswitch/switch_test.go
Outdated
| isResolution: true, | ||
| }, | ||
| expectedResult: &PaymentResult{ | ||
| Error: NewDetailedLinkError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this test fails with the old nil check ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it fails since we fall into the default switch case within parseFailedPayment which attempts to perform error decryption.
|
not super high prio because the local router still flags it as a failed payment but I think we can add the fix if you polish this PR. |
|
Basically, this should improve the functioning of // 2. A resolution from the chain arbitrator, which possibly has no failure
// reason attached. |
bcb2bcd to
68c92d6
Compare
This commit adds a new test, TestOnChainResolutionFailure, which demonstrates a error misclassification in the htlcswitch. The test shows that when an on-chain resolution with a nil/empty error reason is read from the result store, it is misclassified by the parseFailedPayment function. On master, this leads to an incorrect error being returned to the ChannelRouter by the GetAttemptResult function. This test will fail on the current commit and will be fixed in the subsequent commit.
This commit fixes a bug in `parseFailedPayment` where an on-chain resolution with an empty reason was not being correctly identified after being deserialized from disk by SubscribeResult or GetResult. This impacts callers waiting for the final settle/fail result on an htlc attempt via GetAttemptResult. The lnwire codec transforms a `nil` reason into a non-nil, empty byte slice and bypasses the existing check for `htlc.Reason == nil`. Thankfully this does not crash the daemon. Rather, it just leads to the caller (in our case, the ChannelRouter) not actually receiving the expected FailPermanentChannelFailure link error.
68c92d6 to
24a7655
Compare
Change Description
This PR demonstrates a subtle bug in the
htlcswitchthat causes an incorrect error to be returned for on-chain HTLC resolutions. On the introduction of theswitchrpcservice in #9489, this bug can be escalated, leading to an ambiguous scenario where the Switch reports a payment result which has no forwarding error, pre-image, or encrypted error for the RPC client.Problem
When the
contractcourtresolves a dust HTLC on-chain, it sends aResolutionMsgto the switch that results in anUpdateFailHTLCwith anilerrorReason.The core issue stems from a a subtle artifact in the
networkResultstore's persistence logic:nilhtlc.Reasonslice is serialized to disk as a zero-length byte array.[]byte{}).This
nil->[]byte{}transformation causes the check for on-chain resolutions to fail. As a result, this code path, which is intended to returnFailPermanentChannelFailureto signal the channel is being resolved on-chain, is never executed for results read from disk.Instead, the code falls through to the
defaultcase. The Switch attempts to decrypt the empty slice, which fails, causingparseFailedPaymentto returnErrUnreadableFailureMessage.The impact on
masteris a misclassification of the error. TheChannelRouterreceives a generic, imprecise error instead of the correctFailPermanentChannelFailure, hindering its ability to accurately understand the payment's terminal state.Solution
The fix is to make the check in
parseFailedPaymentrobust to the serialization artifact by checking the length of the reason instead of whether it'snil.This change ensures that on-chain resolutions are correctly identified and processed, whether the
Reasonisnil(in-memory) or an empty slice (post-deserialization). This makes the intended code path reachable and allows the switch to return the precise error to theChannelRouter.Steps to Test
nil->[]byte{}transformation during a disk round-trip:go test -v -timeout 30s --tags dev -run ^TestNetworkResultSerialization$ github.com/lightningnetwork/lnd/htlcswitchTestExtractResult, which simulates the disk I/O, both with and without the final commit which provides the fix:go test -v -timeout 30s -run ^TestExtractResult$ github.com/lightningnetwork/lnd/htlcswitchPull Request Checklist
Testing