Skip to content

Conversation

yyforyongyu
Copy link
Member

@yyforyongyu yyforyongyu commented Sep 12, 2025

We now change the flap count used in the chanfitness package - we now only track the values when the peer has a channel with us.

@yyforyongyu yyforyongyu self-assigned this Sep 12, 2025
@yyforyongyu yyforyongyu added rpc Related to the RPC interface p2p Code related to the peer-to-peer behaviour size/kilo medium, proper context needed, less than 1000 lines labels Sep 12, 2025
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @yyforyongyu, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request streamlines the chanfitness package by completely removing the peer flap count and detailed uptime tracking functionality. This decision was made because the feature was incomplete and alternative tools like faraday provide similar capabilities, reducing maintenance overhead for the project.

Highlights

  • Feature Removal: The 'peer flap count' functionality has been entirely removed from the chanfitness package. This includes all associated logic for tracking peer online/offline events, rate limiting, and persistence of flap counts.
  • Code Simplification: Significant code cleanup and simplification in chanfitness/chanevent.go and chanfitness/chaneventstore.go, removing fields, functions, and event subscriptions related to peer online status and flap count management.
  • API Changes: The peerMonitor interface has been updated, and the channelUptime method is now channelLifetime, returning only the channel's total monitored duration, without a separate uptime metric. RPC fields uptime, flap_count, and last_flap_ns are deprecated and will always return zero.
  • Test Suite Reduction: Corresponding test files and test cases for the removed flap count and uptime logic have been deleted or updated, streamlining the test suite.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the peer flap count functionality from the chanfitness package to simplify maintenance. The changes are extensive, touching many files to remove related logic, dependencies, and API fields. The removal appears to be thorough and correct. My review focuses on ensuring that documentation and comments are updated to reflect these changes, and I've found a couple of outdated function comments that should be corrected for clarity and consistency.

@yyforyongyu
Copy link
Member Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the peer flap count functionality from the chanfitness package to improve maintainability. The changes are extensive and consistently applied across the codebase, including logic, tests, and RPC interface definitions. The deprecated RPC fields are clearly marked in the .proto files. I have one suggestion to use the idiomatic [deprecated = true] option in the protobuf definitions for better tooling support, which would be a nice improvement.

@yyforyongyu yyforyongyu force-pushed the remove-flap-count branch 3 times, most recently from 9a2f116 to 624eff9 Compare September 12, 2025 15:03
@yyforyongyu
Copy link
Member Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the peer flap count and uptime tracking functionality from the chanfitness package. The changes are extensive, touching application logic, tests, and API definitions to remove all traces of the feature. This includes removing state from peerLog, deleting event handling and rate-limiting logic, and simplifying the ChannelEventStore. Corresponding RPC fields in lightning.proto have been deprecated, and related logic in rpcserver.go and server.go has been removed. The tests have been updated accordingly to reflect the removed functionality. Overall, this is a great simplification that improves the maintainability of the codebase, and the changes are well-executed and consistent.

@Roasbeef
Copy link
Member

Error: ./rpcserver.go:5067:31: info.Uptime undefined (type *chanfitness.ChannelInfo has no field or method Uptime)

Minimally perhaps we can keep uptime, even if only for peers we have channels with?

@ziggie1984
Copy link
Collaborator

I think we should first add a period where we depreciate these values but still fill the right values in there. There might be people using them? Are we planning to replace this feature with something different ?

@yyforyongyu
Copy link
Member Author

Minimally perhaps we can keep uptime, even if only for peers we have channels with?

Yeah I tried that too - the issue is, to calculate uptime we need to track the peer's connect/disconnect events, which means we need to decide to track the peer or not based on whether there's a channel. Hmmm as I'm typing I think there's another way to implement this, let me try.

We rename `getPeerMonitor` to `getOrCreatePeerMonitor` for clarity - it
will now send an online event when the peer monitor is being created,
which means we will only start counting the peer as online when it has
opened a channel with us.
Similar to how we handle `chanfitness.ErrChannelNotFound`, we now also
ignore the `ErrPeerNotFound`. This is needed as previously we will
always see the peer in the channel event store given it'd added when
connected. This is no longer the case as we only add the peer to the map
when the channel is added.
We need to make sure the `ListPeers` to be robust against errors from
the `FlapCount` so this RPC won't fail due to no flap count info. Also
updated the itest to check this field.
@yyforyongyu yyforyongyu changed the title chanfitness: remove peer flap count chanfitness: track flap counts for peers with channels Sep 15, 2025
@yyforyongyu yyforyongyu added this to the v0.20.0 milestone Sep 15, 2025
@yyforyongyu yyforyongyu moved this to In review in lnd v0.20 Sep 16, 2025
@yyforyongyu yyforyongyu added the P0 very high priority issue/PR, blocker on all others label Sep 16, 2025
Copy link
Collaborator

@gijswijs gijswijs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nice PR. I only have some small nits.

@@ -487,7 +498,7 @@ func (c *ChannelEventStore) getChanInfo(req channelInfoRequest) (*ChannelInfo,

peerMonitor, ok := c.peers[req.peer]
if !ok {
return nil, ErrPeerNotFound
return nil, fmt.Errorf("%w: %v", ErrPeerNotFound, req.peer)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love a sentinel error wrapped with additional context.

👍

switch err {
switch {
// If the store does not know about the peer, we just log it.
case errors.Is(err, chanfitness.ErrPeerNotFound):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapped sentinel errors and having errors.Is here! Love it.

👍

@@ -2242,7 +2242,7 @@ func (h *HarnessTest) ReceiveChannelEvent(

select {
case <-time.After(DefaultTimeout):
require.Fail(h, "timeout", "timeout intercepting htlc")
require.Fail(h, "timeout", "timeout receiving channel events")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

piconit: This change isn't covered by the description in the commit message.

It's a textual improvement of an unrelated case.

// Finally we assert that the flap count is updated as expected.
resp := alice.RPC.ListPeers()
for _, p := range resp.Peers {
// There's only one channel open event here so the flap count
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment confuses me. Multiple channel open events won't increase the flap count, right? Only the first channel open event calls into peerMonitor.onlineEvent(true)

If my understanding is correct, this is what happens:

// The channel open event resulted in an online event, so we expect the flap count to be 1.

@@ -179,6 +179,10 @@ circuit. The indices are only available for forwarding events saved after v0.20.
for the case where the DB is kvdb backed and no invoices have yet been added
to the database.

* The `FlapCount` and `LastFlapNs` have been
[changed](https://github.com/lightningnetwork/lnd/pull/10211) to track for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ultranit:

to track exclusively for peers that have channels with us

Since it already tracked for all peers, but no it doesn't track anymore for peers that don't have a channel with us.

Copy link
Collaborator

@morehouse morehouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. Flap counts are most important for peers that have channels, so operators can decide if it's worth closing channels with frequently offline peers.

I think we could probably remove a lot more from chanfitness -- it seems like it's tracking more than necessary. But this PR seems like an improvement at least.

Comment on lines +262 to +267
resp := alice.RPC.ListPeers()
for _, p := range resp.Peers {
// There's only one channel open event here so the flap count
// should be 1.
require.EqualValues(ht, 1, p.FlapCount)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should also assert that there's only 1 peer.

Comment on lines +3565 to +3569
// Log the error if we cannot get the flap count instead
// of failing this RPC call.
if err != nil {
return nil, err
rpcsLog.Warnf("Failed to get flap count for "+
"peer %v", vertex)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we now expect peers without channels to hit this case? If it's expected, it shouldn't be a "warning". Also seems it would create a lot of log spam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P0 very high priority issue/PR, blocker on all others p2p Code related to the peer-to-peer behaviour rpc Related to the RPC interface size/kilo medium, proper context needed, less than 1000 lines
Projects
Status: In review
Development

Successfully merging this pull request may close these issues.

5 participants