Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unlock Coins on Anchor TX Broadcast Failure #1348

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ffranr
Copy link
Contributor

@ffranr ffranr commented Feb 3, 2025

This PR ensures that asset coins are properly unlocked if an error occurs during the anchor transaction broadcast process. Previously, failed broadcasts could leave coins locked, causing send attempts to be stuck.

Key changes:

  • Unlock asset coins before returning an error when publishing an anchor transaction.
  • Ensure inputs are unlocked if an error occurs before reaching the broadcast complete state.
  • Introduce SendStateBroadcastComplete to better define when a transfer can be canceled.
  • Improve logging by reporting the final fee rate of the anchoring transaction and logging the wallet's minimum relay fee for better debugging.

@ffranr ffranr self-assigned this Feb 3, 2025
@ffranr ffranr force-pushed the unlock-coins-on-broadcast-fail branch 2 times, most recently from 3d7d78e to 8dba8e6 Compare February 3, 2025 16:27
@coveralls
Copy link

coveralls commented Feb 3, 2025

Pull Request Test Coverage Report for Build 13379509732

Details

  • 0 of 28 (0.0%) changed or added relevant lines in 4 files are covered.
  • 32 unchanged lines in 6 files lost coverage.
  • Overall coverage remained the same at 41.014%

Changes Missing Coverage Covered Lines Changed/Added Lines %
itest/assertions.go 0 1 0.0%
tapfreighter/parcel.go 0 2 0.0%
tapfreighter/wallet.go 0 8 0.0%
tapfreighter/chain_porter.go 0 17 0.0%
Files with Coverage Reduction New Missed Lines %
tapgarden/planter.go 2 64.98%
tappsbt/create.go 2 53.22%
tapfreighter/chain_porter.go 4 0.0%
tapgarden/caretaker.go 4 68.11%
commitment/tap.go 5 83.64%
universe/interface.go 15 50.65%
Totals Coverage Status
Change from base Build 13373496688: 0.0%
Covered Lines: 27339
Relevant Lines: 66657

💛 - Coveralls

Copy link
Contributor

@gijswijs gijswijs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to rethink this at a few levels. I think this might be a good time to introduce a failing state that properly handles releasing coins, releasing acnhor tx inputs and cleans up any lingering parcels. I've made some comments throughout where I see issues with this.

There are some points where I think the specific error message we get back from WalletKit informs us on how to proceed. I've mentioned this throughout as well.

Thirdly, we could also use a RPC endpoint that allows you to cancel stuck parcels. But mayby that's something for another PR.

// At this stage, the transaction has been broadcast to the network.
// From this point forward, the transfer cancellation methodology
// changes.
case SendStateBroadcastComplete:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really elegant, adding this extra state here.

@@ -1388,10 +1391,24 @@ func (p *ChainPorter) stateStep(currentPkg sendPackage) (*sendPackage, error) {
"transaction %v: %w", txHash, err)

case err != nil:
// Unlock the inputs we locked for this transfer before
// returning the error.
p.unlockInputs(ctx, &currentPkg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that every error we get back from PublishTransaction warrants calling unlockInputs. I'm not 100% sure that we couldn't get into a state where we do have a tx in the mempool, but still unlock the inputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could explicitly check whether the error message contains "min relay fee not met" to specifically address this coin-locking issue when the wallet is bitcoind.

I agree that unlocking for a general error might be too risky. In a future update, we could consider checking the mempool from unlockInputs.

@@ -1474,7 +1491,7 @@ func (p *ChainPorter) unlockInputs(ctx context.Context, pkg *sendPackage) {
// sanity-check that we have known input commitments to unlock, since
// that might not always be the case (for example if another party
// contributes inputs).
if pkg.SendState < SendStateStorePreBroadcast &&
if pkg.SendState < SendStateBroadcastComplete &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove the lease on assets when we are in state SendStateStorePreBroadcast or SendStateBroadcast, we don't clean up the parcel. Which seems wrong to me. This wasn't fully the case in the old situation because we only released coins when the parcel wasn't persisted to disk. So at least upon restart, the parcel was lost, and the assets would truly be free again. In this PR, we get situations were we remove the lease on coins, but still have parcels persisted to disk, that are loaded after a restart.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar problem occurs a few lines down (and wasn't introduced by this PR, but should still be handled)

err := p.cfg.Wallet.UnlockInput(ctx, op)

In states SendStateStorePreBroadcast and SendStateBroadcast (which are the final two states that call into unlockInputs) we try to release the inputs of the anchor tx but we this leaves us with a situation where we still have the parcel. This PR changes the behavior and at least releases the assets, but we are still stuck with a pending parcel.

Also, we should inform the user what to do if p.cfg.Wallet.UnlockInput fails, because now he has a potential asset burning tx in his lnd node that we couldn't cancel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should cleanup the pending parcel as well.

Also, we should inform the user what to do if p.cfg.Wallet.UnlockInput fails, because now he has a potential asset burning tx in his lnd node that we couldn't cancel.

I don't think we currently call unlockInputs in state broadcast complete or beyond. So LND won't have a burning tx.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, in the worst case, if we don't cancel the pending parcel then upon resumption, the asset coins would have been spent and the transaction would fail because the UTXO would have been spent.

@Roasbeef
Copy link
Member

Roasbeef commented Feb 4, 2025

I think before broadcast, we can call into CheckMempoolAccept, then only broadcast if that returns that it's likely to be accepted.

Here's an example in lnd: https://github.com/lightningnetwork/lnd/blob/6bf895aeb929fb64265bfb226f796392921d8d3a/lnwallet/btcwallet/btcwallet.go#L1928-L1957

@ffranr ffranr force-pushed the unlock-coins-on-broadcast-fail branch from e0c9661 to 19edeee Compare February 4, 2025 19:57
@ffranr ffranr removed the request for review from GeorgeTsagk February 5, 2025 13:39
A user encountered an error where the anchor transaction fee was
reported to be lower than the wallet's minimum relay fee. To provide
more context, log the minimum relay fee as reported by the wallet.
The absolute fee is already reported. This change calculates and reports
the final fee rate using the virtual size of the completed transaction.
Introduce a new send state to act as a threshold for transfer
cancellation handling. After this state, the anchor transaction may
already be in the mempool or confirmed on-chain. Before reaching this
point, however, the broadcast may have failed, requiring different
handling.
Update unlock logic to ensure that errors encountered during the
anchor tx broadcast process, but before reaching the broadcast complete
state, result in coins being unlocked. This prevents assets from
remaining locked in case of intermediate failures.
Ensure locked asset coins are released before returning an error when
attempting to publish the anchor transaction. Without this change, if
the chain bridge logic determines the anchor transaction should not be
broadcast, the coins remain locked, leaving the send attempt stuck in
limbo.
@ffranr ffranr force-pushed the unlock-coins-on-broadcast-fail branch from 19edeee to cec657e Compare February 17, 2025 22:27
@lightninglabs-deploy
Copy link

@ffranr, remember to re-request review from reviewers when ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

5 participants