Skip to content

Introduce SessionEvent::Closed#1078

Merged
spacebear21 merged 3 commits intopayjoin:masterfrom
spacebear21:closed-session-event
Sep 22, 2025
Merged

Introduce SessionEvent::Closed#1078
spacebear21 merged 3 commits intopayjoin:masterfrom
spacebear21:closed-session-event

Conversation

@spacebear21
Copy link
Collaborator

@spacebear21 spacebear21 commented Sep 15, 2025

The first commit introduces the Closed session event. The second commit introduces Closed(SessionOutcome) which can be a Success, Failure, or Abort with a provided reason (e.g. for manual closures by implementers).

This PR only uses the Success reason when a response is successfully processed, but #1061 should update this to close the session only once a payjoin tx was actually detected. #1060 will use the Failure reason for closing sessions after an error has been handled. We'll also want to expose a method for implementers to close a session with a custom reason in another follow-up.

There is some overlap between this change and #1075 since this also modifies MaybeSuccessTransition cc @arminsabouri

Pull Request Checklist

Please confirm the following before requesting review:

@coveralls
Copy link
Collaborator

coveralls commented Sep 15, 2025

Pull Request Test Coverage Report for Build 17920670454

Details

  • 30 of 31 (96.77%) changed or added relevant lines in 3 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 84.766%

Changes Missing Coverage Covered Lines Changed/Added Lines %
payjoin/src/core/receive/v2/session.rs 18 19 94.74%
Files with Coverage Reduction New Missed Lines %
payjoin/src/core/receive/v2/mod.rs 1 92.52%
Totals Coverage Status
Change from base Build 17883934451: 0.02%
Covered Lines: 8580
Relevant Lines: 10122

💛 - Coveralls


(_, SessionEvent::SessionInvalid(_, _)) => Ok(ReceiveSession::TerminalFailure),

(current_state, SessionEvent::Closed) => Ok(current_state),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is actually necessary, in theory a closed session should never get replayed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would replay a closed session for getting the session history.
E.g if session expires we replay and get the fallback

Copy link
Collaborator

@arminsabouri arminsabouri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's still TBD whether this should be an inner value like Closed(Reason) or distinct SessionEvent variants for each known reason. I'm leaning towards the former if we can make it work.

I also agree with the former

@arminsabouri arminsabouri added this to the payjoin-1.0 milestone Sep 17, 2025
@spacebear21 spacebear21 marked this pull request as ready for review September 17, 2025 18:24
@spacebear21
Copy link
Collaborator Author

Instead of ClosureReason I went with SessionOutcome which more accurately describes the enum's purpose IMO.

Copy link
Collaborator

@arminsabouri arminsabouri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK. The one thing I am ambivalent on is the inner string on the Abort variant

/// Payjoin failed to complete
Failure,
/// Payjoin was aborted by the caller with the provided reason
Abort(String),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to store the user provided reason on our event log? This does feel like purely an application concern.

Success,
/// Payjoin failed to complete
Failure,
/// Payjoin was aborted by the caller with the provided reason
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the caller strictly the application here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there may be situations where the state machine aborts, e.g. if the monitoring typestate detects a doublespend? It is primarily there to expose an "abort" method to the application though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think there should be a separate UserAbort (unit type) to clearly indicate that the user (not the app) chose to abort for whatever arbitrary reason (and if that reason is relevant the app should track it in its own data).

i'm not convinced we need another kind of abort though, a Failure (also still unit type) seems sufficient? my reasoning for this is kinda long sorry...

for double spends, from the pov of the receiver, and assuming the receiver is honest, i believe the full tree of protocol outcomes, happy path first is:

  • receiver generates pj URI
    • sender sends fallback via pj endpoint
      • receiver replies with payjoin psbt
        • sender broadcasts payjoin txn
        • receiver broadcasts fallback txn after expiration
        • sender broadcasts fallback txn
        • sender broadcasts payment txn (e.g. due to partial loss of state maybe?)
        • sender broadcasts double spend fallback/payjoin txn (without paying receiver)
      • receiver times out trying to send reply, or fails to sign, and broadcasts fallback txn
      • sender broadcasts fallback txn
      • sender broadcasts payment txn
      • sender double spends fallback txn (without paying receiver)
    • sender broadcasts payment txn

and subsequently, the txn replacement and confirmation (and reorg) state machine is the wallet's responsibility, like it is for its non-payjoin transactions (i.e. Success also indicates successfuly producing a payjoin txn, not necessarily confirming it or getting paid).

if a human user is involved in receiving, double spends may be intended, and wallets should not do anything automatically but could propose fee bumping txns that pay the wallet.

if the receiver is automated, it may be appropriate to respond to double spends by aggressive fee bumping, especially if UTXO probing is a consideration.

in both settings a double spend of the fallback is arguably a deviation from the protocol. imagine a scenario where responding takes a long time, e.g. the receiver must access a hardware wallet to respond with a payjoin psbt, and detects a double spend of the fallback before producing a payjoin psbt, or also covering the automated setting, maybe the directory is temporarily unavailable.

  • continue (in which case a malicious sender may learn a UTXO but not end up signing and still get in a fee bumping war over the double spend in the worst case, but without a reply the sender is definitely not signing the payjoin psbt, whereas if it does receive a reply there's at least a non 0 chance)
  • consider this a protocol deviation (which it arguably is) and mark the session as failed due to double spend (another Abort variant? or Failed with a reason?)
  • alert the user (only in the interactive setting) who can abort or not (in which case it's continue)

failing requires the app to notify us of this failure so in some sense it's just reporting a reason to close the session, and that reason might be a specific txn of interest, but it doesn't really make sense to treat this txid as necessary event data IMO, it is enough that we identify the txids of the success case(s) and let the wallet figure out that a conflicting txn exists, is winning a fee race, or got confirmed in line with its chain querying capabilities, as is the decision whether to engage in a fee bidding war.

where such an explicit failure or abort due to double spend event does make sense is when continuing is deemed inappropriate, and we want to prevent accidental replying if the session is resumed later.

if the log is replayed at this point and the failure outcome is recorded, then we would still have the fallback txn to give to the wallet and it could use its chainstate logic to figure out whether the failure was a double spend.

Comment on lines 207 to 234
/// Payjoin failed to complete
Failure,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we defined Failure? Is it just protocol and session errors? e.g would session expired get its own variant?

Copy link
Collaborator

@nothingmuch nothingmuch Sep 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's two kinds of success that IMO are worth distinguishing:

  1. payment succeeded (valid fallback received) but payjoin failed
  2. payjoin succeeded (the payjoin state machine has be notified of this by the wallet, since we don't know the full payjoin txn until it's relayed or confirmed, the proposal is missing signatures)

in the 2nd scenario we can tell the wallet what txid confirmations to monitor for confirmation as well as produce the payjoin txn to (re)broadcast as appropriate

if the sender only used segwit inputs, we also know the txid to expect for the payjoin txn if the sender completes it but we can't rebroadcast it without being notified about it due to the signature data being missing

and the failure cases as i see them:

  1. abort (user decided to cancel)
  2. expiration (can be implicit as just lack of successs & expiry time having elapsed? not sure an event is needed unless we distrust the receiver's sytem clock)
  3. counterparty deviated from the protocol (invalid messages)

other errors like the directory or relay failing could be seen as transient until expiration unless the user aborts so i don't see much sense in recording those

finally i think we should be clear in documenting these that this is purely for the payjoin session's success, i.e. a (partly) successful payjoin session does not imply the payment itself succeeded as that depends on the confirmation which is the wallet's purview

So to make this actionable:

  1. Success should contain the missing sender's witnesses that allow completion of the payjoin psbt into a confirmable transaction and guarantee its txid can be computed (after Receiever Monitor Typestate #1061)
  2. The unit Failure variant should be renamed to indicate a deviation from the counterparty (maybe SenderFailure from the receiver's POV?)
  3. Expired could make sense as an additional variant if it can't be inferred
  4. there should be a method on the history that allows querying the status and returns an enum { Active, Closed(Outcome) } where Outcome is an enum that also contains the inferred failure modes that don't need an explicit event since they don't have any prior information to contribute to the log (e.g. Expired if it's not represented as a serialized event)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up. This all checks out with me.

Success should contain the missing sender's witnesses that allow completion of the payjoin psbt into a confirmable transaction and guarantee its txid can be computed (after #1061)

ack. However, should sender witness live in a seperate receiver session event? As I understand it, the closed event is payjoin role agnostic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As currently written the Closed event is specific to the Receiver (it lives inside receive::v2::session::SessionEvent). I think we'll also want to add a Closed variant to the Sender but it would be a separate type.

Comment on lines 207 to 234
/// Payjoin failed to complete
Failure,
Copy link
Collaborator

@nothingmuch nothingmuch Sep 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's two kinds of success that IMO are worth distinguishing:

  1. payment succeeded (valid fallback received) but payjoin failed
  2. payjoin succeeded (the payjoin state machine has be notified of this by the wallet, since we don't know the full payjoin txn until it's relayed or confirmed, the proposal is missing signatures)

in the 2nd scenario we can tell the wallet what txid confirmations to monitor for confirmation as well as produce the payjoin txn to (re)broadcast as appropriate

if the sender only used segwit inputs, we also know the txid to expect for the payjoin txn if the sender completes it but we can't rebroadcast it without being notified about it due to the signature data being missing

and the failure cases as i see them:

  1. abort (user decided to cancel)
  2. expiration (can be implicit as just lack of successs & expiry time having elapsed? not sure an event is needed unless we distrust the receiver's sytem clock)
  3. counterparty deviated from the protocol (invalid messages)

other errors like the directory or relay failing could be seen as transient until expiration unless the user aborts so i don't see much sense in recording those

finally i think we should be clear in documenting these that this is purely for the payjoin session's success, i.e. a (partly) successful payjoin session does not imply the payment itself succeeded as that depends on the confirmation which is the wallet's purview

So to make this actionable:

  1. Success should contain the missing sender's witnesses that allow completion of the payjoin psbt into a confirmable transaction and guarantee its txid can be computed (after Receiever Monitor Typestate #1061)
  2. The unit Failure variant should be renamed to indicate a deviation from the counterparty (maybe SenderFailure from the receiver's POV?)
  3. Expired could make sense as an additional variant if it can't be inferred
  4. there should be a method on the history that allows querying the status and returns an enum { Active, Closed(Outcome) } where Outcome is an enum that also contains the inferred failure modes that don't need an explicit event since they don't have any prior information to contribute to the log (e.g. Expired if it's not represented as a serialized event)

@spacebear21
Copy link
Collaborator Author

@nothingmuch The third commit adds the status() helper method. Instead of nesting another enum here I think it makes sense to just enumerate all variants in the Status enum.

I kept the variants pretty basic for now but I expect #1060 and #1061 will add more as needed. We may also want more variants instead of just "Active" to differentiate between the "waiting for sender" and "receiver actively processing" stages (see also payjoin-cli StatusText https://github.com/payjoin/rust-payjoin/blob/master/payjoin-cli/src/app/v2/mod.rs#L60).

Copy link
Collaborator

@nothingmuch nothingmuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, just one bikeshedding comment re abort (already posted because i couldn't find the thread in the submit review ui how is github this incoherent it's amazing)

@nothingmuch The third commit adds the status() helper method. Instead of nesting another enum here I think it makes sense to just enumerate all variants in the Status enum.

👍 I can see why this would be more ergonomic

I kept the variants pretty basic for now but I expect #1060 and #1061 will add more as needed.

👍

We may also want more variants instead of just "Active" to differentiate between the "waiting for sender" and "receiver actively processing" stages (see also payjoin-cli StatusText https://github.com/payjoin/rust-payjoin/blob/master/payjoin-cli/src/app/v2/mod.rs#L60).

Hmm, isn't replaying more appropriate to figure that out since the typestate machinery already interprets events into a more granular and statically typed thing?

My reasoning for such a status method instead of representing the different terminal states as typestates is that for the final outcome you only have to look at the last event so it's simpler and more efficient than a full log replay.

For active session, would expect in a long running app there would be some sort of task management where the task that owns the Receiver<T> would make status information available much like payjoin-cli does, just pushing the status updates into a channel or writing them into shared state so a UI thread can display the status.

Comment on lines +232 to +230
/// TODO: this should include the sender's witnesses once tx monitoring is implemented
/// (this will ensure the payjoin txid can be computed)
Success,
Copy link
Collaborator

@arminsabouri arminsabouri Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include sender witness in this pub enum? If the purpose if to store and provide the payjoin txid, then we should store the sender witness after monitoring in a receiver session event and expose a seperate payjoin_txid method of SessionHistory.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Monitoring success transition would result in a Closed(Success(witnesses)) instead of the PayjoinTransactionDetected event currently implemented in #1061 .

@spacebear21 spacebear21 force-pushed the closed-session-event branch 2 times, most recently from b5caa54 to 966b6c6 Compare September 19, 2025 18:49
Copy link
Collaborator

@nothingmuch nothingmuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits and the bikeshedding suggestion i made in DMs


// Represents the status of a session that can be inferred from the information in the session
// event log.
#[non_exhaustive]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, what other variants would we expect? not convinced non_exhaustive is needed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was we might add more granular failure statuses per:

querying the status and returns an enum { Active, Closed(Outcome) } where Outcome is an enum that also contains the inferred failure modes that don't need an explicit event

But for more detail I guess doing a full replay and using the typestate machinery is more appropriate. Removed non_exhaustive in the latest push.

This allows capturing session closures explicitly in the session event log.
The outcome variants are kept to a minimum because we should always be
able to deduce the reason an outcome was reached, by replaying the
SessionHistory.
It returns a SessionStatus, which should enumerate the various statuses
that are useful to know about but do not require an explicit
SessionEvent, since they can be inferred from the event log.
@spacebear21
Copy link
Collaborator Author

Merging this based on prior approvals since nits have been addressed.

@spacebear21 spacebear21 merged commit 615b146 into payjoin:master Sep 22, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants