-
Notifications
You must be signed in to change notification settings - Fork 403
LSPS2: Fail (or abandon) intercepted HTLCs if LSP channel open fails #3712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
LSPS2: Fail (or abandon) intercepted HTLCs if LSP channel open fails #3712
Conversation
👋 Thanks for assigning @tnull as a reviewer! |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3712 +/- ##
=======================================
Coverage 89.35% 89.36%
=======================================
Files 157 157
Lines 124095 124178 +83
Branches 124095 124178 +83
=======================================
+ Hits 110886 110966 +80
Misses 10488 10488
- Partials 2721 2724 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
f50496f
to
0d8ba10
Compare
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 2nd Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 3rd Reminder Hey @tnull! This PR has been waiting for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this, and excuse the delay! We'll probably wait with this until after #3509 landed either way though.
/// without resetting the state for a potential payment retry. It removes the intercept SCID | ||
/// mapping along with any outbound channel state and related channel ID mappings associated with | ||
/// the specified `user_channel_id`. | ||
pub fn channel_open_abandoned( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think to make this a bit safer, it should probably fail if we're already at PendingPaymentForward
or later, i.e., we already opened the channel?
Also, what would happen if this is called after the LSPS2ServiceEvent::OpenChannel
has been emitted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now when abandon()
is called, if it's not at PendingInitialPayment
or PendingChannelOpen
, then it fails.
Also, what would happen if this is called after the LSPS2ServiceEvent::OpenChannel has been emitted?
You mean if the channel is already opened, but our state machine hasn’t caught up yet and still thinks we're at PendingChannelOpen
? (Is that possible?). In that case, if we call abandon()
, we’d end up forgetting about a channel that was actually opened, potentially leaving funds locked. If that scenario is possible and we can't reliably detect it, then this version of abandon()
isn't safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean if the channel is already opened, but our state machine hasn’t caught up yet and still thinks we're at PendingChannelOpen? (Is that possible?).
Yes. I think it would be possible if the user had already initiated a channel open but hasn't gotten the ChannelReady
yet? It might be worth documenting that the user is responsible for cleaning up any pending channel opens.
})?; | ||
|
||
let jit_channel = peer_state | ||
.outbound_channels_by_intercept_scid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation seems off.
), | ||
})?; | ||
|
||
jit_channel.state = match &jit_channel.state { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use the matches!
macro here?
@@ -24,6 +25,7 @@ use lightning_invoice::{Bolt11Invoice, InvoiceBuilder, RoutingFees}; | |||
use bitcoin::hashes::{sha256, Hash}; | |||
use bitcoin::secp256k1::{PublicKey, Secp256k1}; | |||
use bitcoin::Network; | |||
use lightning_types::payment::PaymentHash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move import up.
👋 The first review has been submitted! Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer. |
0d8ba10
to
5539995
Compare
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 2nd Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 3rd Reminder Hey @tnull! This PR has been waiting for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse the delay here!
@@ -11,6 +11,7 @@ | |||
|
|||
use alloc::string::{String, ToString}; | |||
use alloc::vec::Vec; | |||
use lightning::util::hash_tables::HashSet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this import down to the other lightning
imports.
/// without resetting the state for a potential payment retry. It removes the intercept SCID | ||
/// mapping along with any outbound channel state and related channel ID mappings associated with | ||
/// the specified `user_channel_id`. | ||
pub fn channel_open_abandoned( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean if the channel is already opened, but our state machine hasn’t caught up yet and still thinks we're at PendingChannelOpen? (Is that possible?).
Yes. I think it would be possible if the user had already initiated a channel open but hasn't gotten the ChannelReady
yet? It might be worth documenting that the user is responsible for cleaning up any pending channel opens.
.outbound_channels_by_intercept_scid | ||
.get_mut(&intercept_scid) | ||
.ok_or_else(|| APIError::APIMisuseError { | ||
err: format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting is off here.
if let OutboundJITChannelState::PendingChannelOpen { payment_queue, .. } = | ||
&jit_channel.state | ||
{ | ||
Arc::clone(payment_queue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a rebase as post-#3509 the payment queue isn't held in an Arc
. We should also avoid constructions that introduce reachable panics like this.
}, | ||
_ => panic!("Expected OpenChannel event"), | ||
}; | ||
assert!(open_channel_event); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unnecessary as you already panic
in the _
arm above. Same below for the reopen_channel_event
bool.
|
||
#[test] | ||
fn channel_open_failed() { | ||
let promise_secret = [42; 32]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we DRY-up the redudant setup code? It would make the individual test cases much more concise and readable.
Add two LSPS2Service methods: 'Abandoned' prunes all channel open state. 'Failed' resets JIT channel to fail HTLCs. It allows a retry on channel open. Closes lightningdevkit#3479.
aad24ac
to
694006d
Compare
Add integration tests to verify channel open reset and pruning handlers. Tests cover: - channel_open_failed resetting state to allow retry. - channel_open_failed error on invalid state. - channel_open_abandoned pruning all open state. - error handling for nonexistent channels.
694006d
to
bf68eda
Compare
Thanks for the review @tnull. PR rebased and comments addressed! |
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 2nd Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 3rd Reminder Hey @tnull! This PR has been waiting for your review. |
🔔 4th Reminder Hey @tnull! This PR has been waiting for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse the delay here.
/// | ||
/// This removes the intercept SCID, any outbound channel state, and associated | ||
/// channel‐ID mappings for the specified `user_channel_id`, but only while the JIT | ||
/// channel is still in `PendingInitialPayment` or `PendingChannelOpen`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: OutboundJITChannelState
is currently not exposed as part of the public API (which might be another task at some point, but this discussion is unrelated). Hence, let's drop these state names and rather describe the behavior.
/// | ||
/// Returns an error if: | ||
/// - there is no channel matching `user_channel_id`, or | ||
/// - the channel has already advanced past `PendingChannelOpen` (e.g. to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: let's drop the state name, and describe what this case means.
/// | ||
/// Note: this does *not* close or roll back any on‐chain channel which may already | ||
/// have been opened. The caller must only invoke this before a channel is actually | ||
/// confirmed (or else provide its own on‐chain cleanup) and is responsible for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, a user needs to call this before/in-place of initiating the channel open, not only before it's confirmed, no?
if let Some(jit_channel) = | ||
peer_state.outbound_channels_by_intercept_scid.get(&intercept_scid) | ||
{ | ||
if !matches!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we move the matches
into a bool variable? This would let the if
read a bit better.
OutboundJITChannelState::PendingInitialPayment { .. } | ||
| OutboundJITChannelState::PendingChannelOpen { .. } | ||
) { | ||
return Err(APIError::APIMisuseError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, if we return with an error here, are we fine that we already removed the intercept SCID above? Or does this need to happen atomically, i.e., should we only remove the intercept SCID from intercept_scid_by_user_channel_id
if we succeed here?
), | ||
})?; | ||
|
||
if !matches!(jit_channel.state, OutboundJITChannelState::PendingChannelOpen { .. }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can drop this, since we're doing the same check match
ing below anyways?
|
||
let mut payment_queue = match &jit_channel.state { | ||
OutboundJITChannelState::PendingChannelOpen { payment_queue, .. } => { | ||
payment_queue.clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to avoid this clone
here if we follow the core::mem::take
pattern we use elsewhere for state transitions.
}); | ||
}, | ||
}; | ||
let payment_hashes: Vec<_> = payment_queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think deduplication over payment hash warrants collect
ing twice here. In fact, fail_htlc_backwards_with_reason
should work idempotently, so we can just iterate over payment_queue
directly.
Closes #3479
Allows the LSP to signal when a channel open fails or is abandoned.
PendingInitialPayment
, allowing the payer to retry.Inspired by previous work and a suggestion from @tnull to add an abandon variant apart from the reset.