feat(dot/peerset): Instrument peerset with peer reputation and number of bans #2241

timwu20 · 2022-01-27T16:21:26Z

Changes

add metrics for reputation score of peers with labelled peer ID
add metrics for peer bannings labelled with labelled peer ID

Tests

go test ./dot/peerset

Issues

Primary Reviewer

@EclesioMeloJunior

EclesioMeloJunior

👍

codecov · 2022-01-27T16:46:13Z

Codecov Report

Merging #2241 (73a98ae) into development (a90a6e0) will increase coverage by 0.29%.
The diff coverage is 77.92%.

@@               Coverage Diff               @@
##           development    #2241      +/-   ##
===============================================
+ Coverage        59.52%   59.82%   +0.29%     
===============================================
  Files              210      212       +2     
  Lines            27540    27666     +126     
===============================================
+ Hits             16394    16552     +158     
+ Misses            9447     9417      -30     
+ Partials          1699     1697       -2

Flag	Coverage Δ
unit-tests	`59.82% <77.92%> (+0.29%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
dot/network/config.go	`63.01% <ø> (ø)`
dot/node.go	`55.71% <4.34%> (-1.86%)`	⬇️
dot/network/service.go	`56.17% <4.54%> (+1.38%)`	⬆️
lib/grandpa/grandpa.go	`59.80% <40.00%> (-0.34%)`	⬇️
dot/state/block.go	`44.92% <62.50%> (-0.07%)`	⬇️
internal/metrics/metrics.go	`68.88% <68.88%> (ø)`
dot/sync/chain_sync.go	`54.84% <83.33%> (+1.15%)`	⬆️
dot/state/storage.go	`52.06% <93.33%> (-2.35%)`	⬇️
dot/peerset/peerset.go	`49.42% <100.00%> (+0.23%)`	⬆️
dot/peerset/peerstate.go	`76.05% <100.00%> (+0.10%)`	⬆️
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e816e31...73a98ae. Read the comment docs.

qdm12 · 2022-02-07T14:52:08Z

dot/peerset/peerset.go

@@ -282,6 +291,7 @@ func (ps *PeerSet) updateTime() error {
 			after := reputationTick(before)
 			n.setReputation(after)
 			ps.peerState.nodes[peerID] = n
+			reputationGauge.WithLabelValues(peerID.Pretty()).Set(float64(after))


That will potentially blow up memory usage of the Prometheus server since peerID is rather random. Maybe leave this as a log or find another way to report the metric with a lower cardinality / bounded label.

See the caution note in yellow at https://prometheus.io/docs/practices/naming/

Ideally I only want to label this on the devnets where the peers are bound to only like 3-4 nodes. I'll refactor this PR to make this configurable.

Maybe just have it as a periodic log? 🤔 Like our sync benchmarker?

Or you could have a mapping from [1 to 50] to peerIDs, use the [1-50] set for the metrics labels and log out the mapping at program start. Not totally ideal either, but that would work.

qdm12 · 2022-02-07T14:53:14Z

dot/peerset/peerset.go

@@ -334,6 +344,7 @@ func (ps *PeerSet) reportPeer(change ReputationChange, peers ...peer.ID) error {
 		for i := 0; i < setLen; i++ {
 			if ps.peerState.peerStatus(i, pid) == connectedPeer {
 				// disconnect peer
+				bannedCounter.WithLabelValues(pid.Pretty()).Inc()


Same problem here

timwu20 requested review from EclesioMeloJunior, edwardmack, jimjbrettj, kishansagathiya, noot and qdm12 as code owners January 27, 2022 16:21

timwu20 changed the title ~~instrument peerset with reputation and number of bans~~ feat(dot/peerset): Instrument peerset with peer reputation and number of bans Jan 27, 2022

EclesioMeloJunior approved these changes Jan 27, 2022

View reviewed changes

Base automatically changed from tim/metrics-refactor to development January 28, 2022 18:51

instrument peerset with reputation and number of bans

73a98ae

timwu20 force-pushed the tim/peerset-metrics branch from 1fd5e6d to 73a98ae Compare January 28, 2022 18:55

edwardmack approved these changes Feb 2, 2022

View reviewed changes

kishansagathiya approved these changes Feb 3, 2022

View reviewed changes

noot approved these changes Feb 3, 2022

View reviewed changes

qdm12 suggested changes Feb 7, 2022

View reviewed changes

timwu20 marked this pull request as draft February 16, 2022 18:23

jimjbrettj approved these changes May 25, 2022

View reviewed changes

timwu20 closed this Oct 5, 2022

EclesioMeloJunior deleted the tim/peerset-metrics branch October 5, 2022 11:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(dot/peerset): Instrument peerset with peer reputation and number of bans #2241

feat(dot/peerset): Instrument peerset with peer reputation and number of bans #2241

timwu20 commented Jan 27, 2022 •

edited

Loading

EclesioMeloJunior left a comment •

edited

Loading

codecov bot commented Jan 27, 2022 •

edited

Loading

qdm12 Feb 7, 2022 •

edited

Loading

timwu20 Feb 10, 2022

qdm12 Feb 15, 2022

qdm12 Feb 7, 2022

feat(dot/peerset): Instrument peerset with peer reputation and number of bans #2241

feat(dot/peerset): Instrument peerset with peer reputation and number of bans #2241

Conversation

timwu20 commented Jan 27, 2022 • edited Loading

Changes

Tests

Issues

Primary Reviewer

EclesioMeloJunior left a comment • edited Loading

Choose a reason for hiding this comment

codecov bot commented Jan 27, 2022 • edited Loading

Codecov Report

qdm12 Feb 7, 2022 • edited Loading

Choose a reason for hiding this comment

timwu20 Feb 10, 2022

Choose a reason for hiding this comment

qdm12 Feb 15, 2022

Choose a reason for hiding this comment

qdm12 Feb 7, 2022

Choose a reason for hiding this comment

timwu20 commented Jan 27, 2022 •

edited

Loading

EclesioMeloJunior left a comment •

edited

Loading

codecov bot commented Jan 27, 2022 •

edited

Loading

qdm12 Feb 7, 2022 •

edited

Loading