-
Notifications
You must be signed in to change notification settings - Fork 2.2k
chanfitness: track flap counts for peers with channels #10211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
chanfitness: track flap counts for peers with channels #10211
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @yyforyongyu, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request streamlines the chanfitness
package by completely removing the peer flap count and detailed uptime tracking functionality. This decision was made because the feature was incomplete and alternative tools like faraday
provide similar capabilities, reducing maintenance overhead for the project.
Highlights
- Feature Removal: The 'peer flap count' functionality has been entirely removed from the
chanfitness
package. This includes all associated logic for tracking peer online/offline events, rate limiting, and persistence of flap counts. - Code Simplification: Significant code cleanup and simplification in
chanfitness/chanevent.go
andchanfitness/chaneventstore.go
, removing fields, functions, and event subscriptions related to peer online status and flap count management. - API Changes: The
peerMonitor
interface has been updated, and thechannelUptime
method is nowchannelLifetime
, returning only the channel's total monitored duration, without a separate uptime metric. RPC fieldsuptime
,flap_count
, andlast_flap_ns
are deprecated and will always return zero. - Test Suite Reduction: Corresponding test files and test cases for the removed flap count and uptime logic have been deleted or updated, streamlining the test suite.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the peer flap count functionality from the chanfitness
package to simplify maintenance. The changes are extensive, touching many files to remove related logic, dependencies, and API fields. The removal appears to be thorough and correct. My review focuses on ensuring that documentation and comments are updated to reflect these changes, and I've found a couple of outdated function comments that should be corrected for clarity and consistency.
786eca5
to
2156bfb
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the peer flap count functionality from the chanfitness
package to improve maintainability. The changes are extensive and consistently applied across the codebase, including logic, tests, and RPC interface definitions. The deprecated RPC fields are clearly marked in the .proto
files. I have one suggestion to use the idiomatic [deprecated = true]
option in the protobuf definitions for better tooling support, which would be a nice improvement.
9a2f116
to
624eff9
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the peer flap count and uptime tracking functionality from the chanfitness
package. The changes are extensive, touching application logic, tests, and API definitions to remove all traces of the feature. This includes removing state from peerLog
, deleting event handling and rate-limiting logic, and simplifying the ChannelEventStore
. Corresponding RPC fields in lightning.proto
have been deprecated, and related logic in rpcserver.go
and server.go
has been removed. The tests have been updated accordingly to reflect the removed functionality. Overall, this is a great simplification that improves the maintainability of the codebase, and the changes are well-executed and consistent.
Minimally perhaps we can keep uptime, even if only for peers we have channels with? |
I think we should first add a period where we depreciate these values but still fill the right values in there. There might be people using them? Are we planning to replace this feature with something different ? |
624eff9
to
0f779d0
Compare
Yeah I tried that too - the issue is, to calculate |
We rename `getPeerMonitor` to `getOrCreatePeerMonitor` for clarity - it will now send an online event when the peer monitor is being created, which means we will only start counting the peer as online when it has opened a channel with us.
Similar to how we handle `chanfitness.ErrChannelNotFound`, we now also ignore the `ErrPeerNotFound`. This is needed as previously we will always see the peer in the channel event store given it'd added when connected. This is no longer the case as we only add the peer to the map when the channel is added.
We need to make sure the `ListPeers` to be robust against errors from the `FlapCount` so this RPC won't fail due to no flap count info. Also updated the itest to check this field.
0f779d0
to
1dc239b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super nice PR. I only have some small nits.
@@ -487,7 +498,7 @@ func (c *ChannelEventStore) getChanInfo(req channelInfoRequest) (*ChannelInfo, | |||
|
|||
peerMonitor, ok := c.peers[req.peer] | |||
if !ok { | |||
return nil, ErrPeerNotFound | |||
return nil, fmt.Errorf("%w: %v", ErrPeerNotFound, req.peer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love a sentinel error wrapped with additional context.
👍
switch err { | ||
switch { | ||
// If the store does not know about the peer, we just log it. | ||
case errors.Is(err, chanfitness.ErrPeerNotFound): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapped sentinel errors and having errors.Is
here! Love it.
👍
@@ -2242,7 +2242,7 @@ func (h *HarnessTest) ReceiveChannelEvent( | |||
|
|||
select { | |||
case <-time.After(DefaultTimeout): | |||
require.Fail(h, "timeout", "timeout intercepting htlc") | |||
require.Fail(h, "timeout", "timeout receiving channel events") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
piconit: This change isn't covered by the description in the commit message.
It's a textual improvement of an unrelated case.
// Finally we assert that the flap count is updated as expected. | ||
resp := alice.RPC.ListPeers() | ||
for _, p := range resp.Peers { | ||
// There's only one channel open event here so the flap count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment confuses me. Multiple channel open events won't increase the flap count, right? Only the first channel open event calls into peerMonitor.onlineEvent(true)
If my understanding is correct, this is what happens:
// The channel open event resulted in an online event, so we expect the flap count to be 1.
@@ -179,6 +179,10 @@ circuit. The indices are only available for forwarding events saved after v0.20. | |||
for the case where the DB is kvdb backed and no invoices have yet been added | |||
to the database. | |||
|
|||
* The `FlapCount` and `LastFlapNs` have been | |||
[changed](https://github.com/lightningnetwork/lnd/pull/10211) to track for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ultranit:
to track exclusively for peers that have channels with us
Since it already tracked for all peers, but no it doesn't track anymore for peers that don't have a channel with us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. Flap counts are most important for peers that have channels, so operators can decide if it's worth closing channels with frequently offline peers.
I think we could probably remove a lot more from chanfitness -- it seems like it's tracking more than necessary. But this PR seems like an improvement at least.
resp := alice.RPC.ListPeers() | ||
for _, p := range resp.Peers { | ||
// There's only one channel open event here so the flap count | ||
// should be 1. | ||
require.EqualValues(ht, 1, p.FlapCount) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should also assert that there's only 1 peer.
// Log the error if we cannot get the flap count instead | ||
// of failing this RPC call. | ||
if err != nil { | ||
return nil, err | ||
rpcsLog.Warnf("Failed to get flap count for "+ | ||
"peer %v", vertex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we now expect peers without channels to hit this case? If it's expected, it shouldn't be a "warning". Also seems it would create a lot of log spam.
We now change the flap count used in the
chanfitness
package - we now only track the values when the peer has a channel with us.