Skip to content

Add ChannelScore struct (WIP) #626

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

sourabhmarathe
Copy link
Contributor

@sourabhmarathe sourabhmarathe commented May 19, 2020

Addressing issue #482

Adds ChannelScore struct to network_graph.rs, a couple methods that handle PaymentSent/Failed events by setting their sent/failed channel_score fields within the DirectionalChannelInfo's stored in ChannelInfo. Most of the added code was the test which involved creating a dummy NetworkGraph object, which accounts for most of the 500+ lines of added code.

Lastly, the fields in DirectionalChannelInfo were renamed from one and two to alice and bob to make them a bit easier to read.

@codecov
Copy link

codecov bot commented May 19, 2020

Codecov Report

Merging #626 into master will increase coverage by 0.08%.
The diff coverage is 94.67%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #626      +/-   ##
==========================================
+ Coverage   91.26%   91.35%   +0.08%     
==========================================
  Files          35       35              
  Lines       20929    21229     +300     
==========================================
+ Hits        19101    19393     +292     
- Misses       1828     1836       +8     
Impacted Files Coverage Δ
lightning/src/ln/functional_test_utils.rs 94.67% <ø> (ø)
lightning/src/routing/router.rs 96.46% <ø> (ø)
lightning/src/util/events.rs 28.42% <0.00%> (-0.62%) ⬇️
lightning/src/routing/network_graph.rs 92.58% <95.41%> (+0.96%) ⬆️
lightning/src/ln/channelmanager.rs 85.51% <100.00%> (+0.05%) ⬆️
lightning/src/ln/functional_tests.rs 97.11% <100.00%> (ø)
lightning/src/ln/msgs.rs 90.59% <0.00%> (+0.55%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2087032...7d2f585. Read the comment docs.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this patch ! It's on the good way but need more thinking on the API. You pass a PublicKey to score_payment_failed_for_route but you may have multiple faultive node for a MPP payment, and I don't think this is encompassed.

What about writing a test to see how everything fit together and iterate from this ?

@sourabhmarathe
Copy link
Contributor Author

Thanks for your review, I have updated the PR to reflect the comments. I will make sure to write a test that tests the functionality of the scoring methods before I publish the PR.

@sourabhmarathe sourabhmarathe force-pushed the score-payments branch 4 times, most recently from 931868a to ec41152 Compare May 24, 2020 01:53
let mut network = net_graph_msg_handler.network_graph.write().unwrap();
let chan_id = network.get_nodes().get(&node_id_1).unwrap().channels[0];
let channel_info = network.channels.get_mut(&chan_id).unwrap();
// Assign a DirectionalChannelInfo and ChannelScore object to one_to_two of
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one_to_two is a awful name, if you have a better idea for naming you got my Concept ACK (alice_to_bob ?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like alice_to_bob. Going to go with that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alice and Bob are suitable in test code when referring to specific examples. However, I don't think the convention should be extended to production code when referring to a general node.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, mostly minor points, but that's cool work :)

@jkczyz jkczyz self-requested a review May 26, 2020 02:28
@sourabhmarathe sourabhmarathe marked this pull request as ready for review May 28, 2020 00:40
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really awesome work, and the code looks good at first glance. I did have one pretty high level comment that would be great to see addressed before I do too deep a dive, hwoever.

/// the channel. Right now it stores a count of PaymentSent and PaymentFailed
/// occurances for a given channel, but there's a lot more that could be
/// added down the road (eg uptime, fee rates, variation, floppiness...)
pub struct ChannelScore {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really think this needs to be abstract - as you point out, there are a lot of potential things that we could use here, and, more importantly, not all of them are going to be used in every case. eg, we might have a ProbingChannelScore-er which constantly probes the network, or we might only pull channel scores from failed/successful payments, as you do here, or a user might download channel scores from a server which provides some set of channel scoring information. Forcing users into a given set of score info isn't sufficient to capture what users may want.

Instead, we may consider generic-izing this to something which returns a "fee penalty" - to keep the route-finding algorithm tractible, the best thing we can do is have the score information be fed in as "fee we're willing to pay to avoid using this channel". Then, we can pass the score_payment_failed_for_route, etc calls into the trait, and have the router pull fee penalty information out of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to make sure I understand what you mean by generic-izing. Would the following be a valid approach to generalizing this to whatever a user may need?

Define a trait called ChannelScorer that requires a function to be defined that returns a fee_penalty: u64. Then, instead of the current channel_score: ChannelScore attribute in DirectionalChannelInfo, there would be an attribute called channel_score: Option<T: impl ChannelScorer>. Basically, any struct that can compute its own fee_penalty would be eligible to serve as that attribute. And if the user did not want to use that, it could just be a None.

Concretely, a user could use the current ChannelScore for this. All the user would have to do is implement the required fee_penalty method and that could be used for route-finding. In the event that the user did not provide a struct for ChannelScorer, we could just go with the default route finding algorithm.

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreement with @TheBlueMatt. Also left some comments as I read through the code. No need to address them until the higher-level design is resolved.

Additionally, from what I can tell, the bulk of the non-test code is in score_payment_sent_for_route and score_payment_failed_for_route, but these are only called from within the newly added test. Did you plan to use these elsewhere?

@@ -98,6 +98,8 @@ pub enum Event {
/// the payment has failed, not just the route in question. If this is not set, you may
/// retry the payment via a different route.
rejected_by_dest: bool,
/// Indicates the node responsible for the failure in the event of NodeFailure event.
faultive_node: Option<PublicKey>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you name this faulty_node? I don't believe "faultive" is a word in English. 😃

@@ -98,6 +98,8 @@ pub enum Event {
/// the payment has failed, not just the route in question. If this is not set, you may
/// retry the payment via a different route.
rejected_by_dest: bool,
/// Indicates the node responsible for the failure in the event of NodeFailure event.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is a bit awkward in that it is overloading the word "event" in a few ways:

  1. PaymentFailed itself is an event
  2. NodeFailure is a case of HTLCFailChannelUpdate which I suppose could be thought of as an event itself, but in a different context
  3. in the idiom "in the event of"

I'd suggest rephrasing simply as:

Indicates the node responsible for the failed payment.

if let Some(update) = channel_update {
faultive_node = match update.clone() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to use clone here.

if let Some(update) = channel_update {
faultive_node = match update.clone() {
msgs::HTLCFailChannelUpdate::NodeFailure {node_id, is_permanent: _} => Some(node_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to list is_permanent if not used:

msgs::HTLCFailChannelUpdate::NodeFailure {node_id, ..} => Some(node_id),

@@ -202,6 +203,36 @@ impl<C: Deref + Sync + Send, L: Deref + Sync + Send> RoutingMessageHandler for N
}
}

#[derive(PartialEq, Debug)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this line below the documentation? As the docs grow, it can be more difficult to see what type these correspond to.

Comment on lines +207 to +208
/// Tracks the score of a payment by holding onto certain metadata about
/// the channel. Right now it stores a count of PaymentSent and PaymentFailed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this say the "score of a channel"? More broadly, if this is metadata used to score a channel, how is the score itself calculated? Guessing eventually a score will be produced from this data, but just want to verify.

/// cooperatively executed a successful payment. In the future, this could be more nuanced (eg
/// only score the direction that cooperates, bring in other parameters for the channel_score
/// field within DirectionalChannelInfo)
pub fn score_payment_sent_for_route(&mut self, route: Vec<Vec<RouteHop>>) -> Result<bool, LightningError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, regarding naming isn't it the channel that is being scored, not the payment?

Comment on lines +543 to +546
/// Increments payment_sent_score for both DirectionalChannelInfos on any channel that
/// cooperatively executed a successful payment. In the future, this could be more nuanced (eg
/// only score the direction that cooperates, bring in other parameters for the channel_score
/// field within DirectionalChannelInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you generalize this documentation (and possibly method name)? As it stands, it discusses implementation details which would have to be updated as the code changes.

}
if channel_down {
return Err(LightningError {
err: "PaymentSent event occurred for this channel, but the channel was closed before scoring",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would someone reading this error message know which channel is being referenced?

Comment on lines +570 to +574
/// Increments payment_failed_score for approriate DirectionalChannelInfos on the network
/// graph. Like the case for scoring PaymentSent events, this method uses the node's PublicKey
/// to identify the appropriate DirectionalChannelInfo to score. From there, there is a check
/// against a list of at fault nodes that determines whether the node's channel score should be
/// penalized for failing to execute or rewarded for executing properly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also strikes me as simply repeating the implementation details.

@sourabhmarathe
Copy link
Contributor Author

Agreement with @TheBlueMatt. Also left some comments as I read through the code. No need to address them until the higher-level design is resolved.

Additionally, from what I can tell, the bulk of the non-test code is in score_payment_sent_for_route and score_payment_failed_for_route, but these are only called from within the newly added test. Did you plan to use these elsewhere?

@jkczyz Yup, I believe the idea was that a user could use those functions in response to PaymentSent/Failed events. As Matt pointed out, since there are a variety of use cases for a ChannelScore, it may be more appropriate to create a trait that the user defines themself and then calls as they require. I would be curious to see what @ariard thinks of this as well.

@sourabhmarathe
Copy link
Contributor Author

Given that the code will have to change at a high level, I am going to close this PR. I will be creating the trait for interfacing between the NetworkGraph and any user's channel metadata, with ChannelInfo storing an fee_penalty: Option to indicate the willingness of a user to utilize a channel based on its metadata, performance, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants