Skip to content

Conversation

@chuksys
Copy link
Contributor

@chuksys chuksys commented Jun 25, 2025

Description

This PR reports pathfinding outcomes to scorer in SimNode. This significantly improves the realism of pathfinding simulations by allowing the scorer to update based on actual payment success or failure.

This is an attempt to complete the work started on this branch.

Changes

  • Payment path now tracked in SimNode.in_flight using InFlightPayment.
  • Added IndexFailure PaymentOutcome variant, which specifically captures the index of the channel in the payment path that caused the failure.
  • Now tracking simulation time in SimNode using SimulationClock for accurate duration reporting.
  • Now reporting payment success/failure to SimNode.scorer using the newly tracked path and failure index (if applicable).

This PR closes #279

@chuksys chuksys force-pushed the report-pathfinding-outcomes branch 3 times, most recently from d151ed3 to 32c4089 Compare June 25, 2025 17:41
@carlaKC
Copy link
Contributor

carlaKC commented Jun 26, 2025

Looking really good overall!

Let's sort out these pesky MPP issues and make the clock generic then we're gg.

@chuksys
Copy link
Contributor Author

chuksys commented Jun 27, 2025

Thanks for the feedback. Will get this sorted asap!

@chuksys
Copy link
Contributor Author

chuksys commented Jul 7, 2025

Added Some Key Changes

  1. Reverted SimNode Immutability: I've reverted some changes introduced in 8ded28e to allow SimNode to mutate its internal fields where necessary.

  2. Added Test Cases for First Hop Failure: New test cases have been added to specifically verify that when a Some(0) is returned during the simulation's payment propagation, it correctly indicates a failure on the first hop (Index 0).

Copy link
Contributor

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions from a very high level pass.

Reverted SimNode Immutability: I've reverted some changes introduced in 8ded28e to allow SimNode to mutate its internal fields where necessary.

What's the motivation for reverting this change? If possible it's preferable to not require implementations of a trait to be mutable. Would rather add a mutex where we need it than re-introduce trait-required mutability.

node.1 .0.clone(),
graph.clone(),
routing_graph.clone(),
SystemClock {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should pass this clock in as a parameter to give callers the freedom to choose what impl they want

}

/// This trait defines the "interface" for payment propagation logic
pub trait PaymentPropagator: Send + Sync + 'static {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a lot of boilerplate just for tests (and possibly somewhere we should use async_trait rather than have these boxes + pins?

Doesn't seem like it would be too much test code to just use real propagation and create payments that fail at the points we want?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right - I actually used async_trait in a fixup of the commit; However I think it's a good idea to use real propagation rather than have all this boilerplate. I'll work on that.

@chuksys
Copy link
Contributor Author

chuksys commented Jul 8, 2025

Would rather add a mutex where we need it than re-introduce trait-required mutability.

Thanks for this suggestion. Will work on this.

@chuksys chuksys requested a review from carlaKC July 9, 2025 00:46
Copy link
Contributor

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you can go ahead and squash this then I'll take a last review round.

Just a few generics things that stand out to me.

// to a dyn trait and exclude any nodes that shouldn't be included in random activity
// generation.
let nodes = ln_node_from_graph(simulation_graph.clone(), routing_graph).await;
let nodes = ln_node_from_graph(simulation_graph.clone(), routing_graph, SystemClock {}).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use the clock passed in?


/// Provides a wall clock implementation of the Clock trait.
#[derive(Clone)]
#[derive(Clone, Copy)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add Copy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually don't need to - will take that out.


/// Produces a map of node public key to lightning node implementation to be used for simulations.
pub async fn ln_node_from_graph(
pub async fn ln_node_from_graph<C: Clock + std::marker::Copy>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than require this to be copy, pass in Arc<C>? This is what we do elsewhere

@chuksys chuksys force-pushed the report-pathfinding-outcomes branch from 2eeda7a to 3dfc98a Compare July 13, 2025 17:28
@chuksys chuksys requested a review from carlaKC July 14, 2025 08:10
@carlaKC
Copy link
Contributor

carlaKC commented Jul 15, 2025

Running on a mock network, I get the following error:
Track payment failed for fb01afc58a88b0d03eccb1c5f469800fdd923d5ab4381667ab56d47bdadcd2f6: Track payment error: payment hash fb01afc58a88b0d03eccb1c5f469800fdd923d5ab4381667ab56d47bdadcd2f6 not found

To Reproduce:
sim-cli -l debug -s ln_3.json -c 1000

ln_3.json

{
  "sim_network": [
    {
      "scid": 329853488398336,
      "capacity_msat": 5000000000,
      "node_1": {
        "pubkey": "02c2fab8d99106ce621cae6d35aaddcc5a13f6ae9c65f2c9cf2adc6570dc08482d",
        "alias": "A",
        "max_htlc_count": 483,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 1000,
        "max_htlc_size_msat": 4950000000,
        "cltv_expiry_delta": 40,
        "base_fee": 1000,
        "fee_rate_prop": 1
      },
      "node_2": {
        "pubkey": "03797da684da0b6de8a813f9d7ebb0412c5d7504619b3fa5255861b991a7f86960",
        "alias": "B",
        "max_htlc_count": 15,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 3000,
        "max_htlc_size_msat": 4455000000,
        "cltv_expiry_delta": 144,
        "base_fee": 1000,
        "fee_rate_prop": 542
      }
    },
    {
      "scid": 332052511653888,
      "capacity_msat": 5000000000,
      "node_1": {
        "pubkey": "03797da684da0b6de8a813f9d7ebb0412c5d7504619b3fa5255861b991a7f86960",
        "alias": "B",
        "max_htlc_count": 483,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 3000,
        "max_htlc_size_msat": 4455000000,
        "cltv_expiry_delta": 144,
        "base_fee": 1000,
        "fee_rate_prop": 542
      },
      "node_2": {
        "pubkey": "03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f",
        "alias": "C",
        "max_htlc_count": 15,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 1,
        "max_htlc_size_msat": 4455000000,
        "cltv_expiry_delta": 144,
        "base_fee": 1000,
        "fee_rate_prop": 499
      }
    }
  ],
  "exclude": [
    "03797da684da0b6de8a813f9d7ebb0412c5d7504619b3fa5255861b991a7f86960"
  ]
}

@chuksys
Copy link
Contributor Author

chuksys commented Jul 16, 2025

Thanks! Will look into this asap.

@carlaKC carlaKC removed their request for review July 16, 2025 18:37
@carlaKC
Copy link
Contributor

carlaKC commented Jul 29, 2025

Any updates here?

@chuksys
Copy link
Contributor Author

chuksys commented Jul 29, 2025

Any updates here?

Apologies for the delay! I'm working on sorting this out and expect to have an update for you by mid next week.

@chuksys chuksys force-pushed the report-pathfinding-outcomes branch from 3dfc98a to 378c3fa Compare August 4, 2025 16:09
@chuksys
Copy link
Contributor Author

chuksys commented Aug 4, 2025

Fix: Properly track payment hashes for failed route scenarios

Previously, payment hashes were being tracked even when they were not added to the in_flight hash map (in scenarios where no route was found). The changes I pushed now cover this case by inserting the payment hash and corresponding InFlightPayment (with path set to None).

I have tested using the same parameters and can confirm I don't see that error anymore:

sim-cli -l debug -s ln_3.json -c 1000

ln_3.json

{
  "sim_network": [
    {
      "scid": 329853488398336,
      "capacity_msat": 5000000000,
      "node_1": {
        "pubkey": "02c2fab8d99106ce621cae6d35aaddcc5a13f6ae9c65f2c9cf2adc6570dc08482d",
        "alias": "A",
        "max_htlc_count": 483,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 1000,
        "max_htlc_size_msat": 4950000000,
        "cltv_expiry_delta": 40,
        "base_fee": 1000,
        "fee_rate_prop": 1
      },
      "node_2": {
        "pubkey": "03797da684da0b6de8a813f9d7ebb0412c5d7504619b3fa5255861b991a7f86960",
        "alias": "B",
        "max_htlc_count": 15,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 3000,
        "max_htlc_size_msat": 4455000000,
        "cltv_expiry_delta": 144,
        "base_fee": 1000,
        "fee_rate_prop": 542
      }
    },
    {
      "scid": 332052511653888,
      "capacity_msat": 5000000000,
      "node_1": {
        "pubkey": "03797da684da0b6de8a813f9d7ebb0412c5d7504619b3fa5255861b991a7f86960",
        "alias": "B",
        "max_htlc_count": 483,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 3000,
        "max_htlc_size_msat": 4455000000,
        "cltv_expiry_delta": 144,
        "base_fee": 1000,
        "fee_rate_prop": 542
      },
      "node_2": {
        "pubkey": "03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f",
        "alias": "C",
        "max_htlc_count": 15,
        "max_in_flight_msat": 5000000000,
        "min_htlc_size_msat": 1,
        "max_htlc_size_msat": 4455000000,
        "cltv_expiry_delta": 144,
        "base_fee": 1000,
        "fee_rate_prop": 499
      }
    }
  ],
  "exclude": [
    "03797da684da0b6de8a813f9d7ebb0412c5d7504619b3fa5255861b991a7f86960"
  ]
}

Please let me know the outcome on your end. Thanks!

@chuksys chuksys requested a review from carlaKC August 4, 2025 16:42
Copy link
Contributor

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting close, few more comments.

Go ahead and squash, I'll just look at the diff for the remaining changeset.

assert!(matches!(
result.unwrap().payment_outcome,
PaymentOutcome::Unknown
PaymentOutcome::IndexFailure(_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well assert on the value here?

Comment on lines +266 to +268
/// Error that occured when getting clock info.
#[error("SystemTime conversion error: {0}")]
SystemTimeConversionError(#[from] SystemTimeError),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used in this commit?

};

// Use the stored scorer when finding a route
let scorer_guard = self.scorer.lock().await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do locking inside of find_payment_route?

if payment_result.payment_outcome == PaymentOutcome::Success {
self.scorer.lock().await.payment_path_successful(&in_flight.path, duration);
} else if let PaymentOutcome::IndexFailure(index) = payment_result.payment_outcome {
self.scorer.lock().await.payment_path_failed(&in_flight.path, index.try_into().unwrap(), duration);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index.try_into().unwrap() -> as u64?

Comment on lines 2075 to 2076
// Alter route to make payment fail on first hop (Index 0).
route.paths[0].hops[0].cltv_expiry_delta = 39;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test 👌

Let's make this a test for failures at each hop?

  • For i = 0; i < hops.len()
    • Modify hop at index i
    • Send payment
    • Assert failure at index i

Comment on lines 2090 to 2127
match result.payment_outcome {
PaymentOutcome::IndexFailure(_) => {
assert_eq!(result.payment_outcome, PaymentOutcome::IndexFailure(0));
},
_ => panic!(
"Expected IndexFailure, but got: {:?}",
result.payment_outcome
),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert!(matches!(
      result.payment_outcome,
      PaymentOutcome::IndexFailure(1)
))

/// The [`lightning::routing::router::build_route_from_hops`] function can be used to build the route to be passed here.
///
/// **Note:** The payment hash passed in here should be used in track_payment to track the payment outcome.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a fixup / squashed.

Comment on lines 762 to 763
None => {
log::warn!("No path found for payment hash {}, cannot update scorer.", hex::encode(hash.0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check this against the PaymentOutcome and error out if it's anything other than RouteNotFound? I don't think we need the warning for the RouteNotFound, as it's expected.

carlaKC and others added 3 commits August 7, 2025 17:18
This commit ensures that the payment apis can only send to a single path.

Payment is rejected if the route has multiple paths.
This commit asserts that payment failure on each hop returns the correct hop index where failure occured.

On payment failure, this ensures the correct index of hop where failure occured is reported to scorer.
@chuksys chuksys force-pushed the report-pathfinding-outcomes branch from 378c3fa to 5b490c0 Compare August 7, 2025 16:43
@chuksys chuksys requested a review from carlaKC August 7, 2025 16:59
@carlaKC
Copy link
Contributor

carlaKC commented Aug 8, 2025

Going to hold off on merge here until we've got #277 and #300 in - bigger feature so I want to give it the easier merge journey. I don't think the rebase will be too bad, and I'll do a last review round after that!

Copy link
Contributor

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just a nit that's not worth blocking merge.

Also in future please try to wrap git messages at 27 chars.

// We will assert that the failure index returned is the index of the hop that we altered.
// This is done by setting the CLTV expiry delta to a value that will cause the payment to fail at that hop.
// The last hop will always succeed, so we only test the first n-1 hops.
for i in 0..route.paths[0].hops.len() - 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for updating this 👌

Comment on lines +2123 to +2126
assert!(matches!(
result.payment_outcome,
PaymentOutcome::IndexFailure(idx) if idx == i
));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just be: assert!(matches!(result.payment_outcome, PaymentOutcome::IndexFailure(i)));

@carlaKC carlaKC merged commit ac400cf into bitcoin-dev-project:main Aug 12, 2025
2 checks passed
@carlaKC carlaKC mentioned this pull request Aug 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature Request: Report pathfinding outcomes to scorer in SimNode

3 participants