-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Add counter for unapproved candidates #7491
Add counter for unapproved candidates #7491
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could also gum::trace
a bounded number of candidate hashes that are unapproved but I'd do it only for depths > 10 in order to not spam the logs needlessly - see #7124 (comment)
node/core/approval-voting/src/lib.rs
Outdated
@@ -160,6 +160,7 @@ struct MetricsInner { | |||
time_db_transaction: prometheus::Histogram, | |||
time_recover_and_approve: prometheus::Histogram, | |||
candidate_signatures_requests_total: prometheus::Counter<prometheus::U64>, | |||
unapproved_candidates_in_unfinalized_chain: prometheus::Counter<prometheus::U64>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a GaugeVec. Keep in mind that approval checking might take between 1-2 relay chain blocks so this would just rise up each call via finality_target_with_longest_chain
by GRANDPA making the metric useless for differentiating between normal operation and failure.
I suggest to have 2 labels for it:
count
to capture situations when approval voting continuously fails to approve certain number of candidates resulting in this going updepth
to capture when just 1 single candidate was not approved while all rest are approved leading to a continuous increase ofdepth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for depth, there already is polkadot_parachain_approval_checking_finality_lag
b8f864c
to
74d2fd0
Compare
node/core/approval-voting/src/lib.rs
Outdated
"polkadot_parachain_approval_unapproved_candidates_in_unfinalized_chain", | ||
"Number of unapproved candidates in unfinalized chain", | ||
), | ||
&["count", "depth"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is not what I meant. There should be one label
, and it's value can be count
or depth
. I think it's better to have 2 separate Gauge in this case, one measuring depth of first unapproved and the total count of unapproved candidates.
node/core/approval-voting/src/lib.rs
Outdated
@@ -1534,6 +1565,16 @@ async fn handle_approved_ancestor<Context>( | |||
unapproved.len(), | |||
entry.candidates().len(), | |||
); | |||
if bits.len() > LOGGING_DEPTH_THRESHOLD { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this would be excessive output. What's holding finality is basically the oldest unapproved candidates. Let's just print a the candidate hashes of the in the oldest unapproved block only.
node/core/approval-voting/src/lib.rs
Outdated
@@ -1534,6 +1553,15 @@ async fn handle_approved_ancestor<Context>( | |||
unapproved.len(), | |||
entry.candidates().len(), | |||
); | |||
if i == oldest_ancestor_idx && bits.len() > LOGGING_DEPTH_THRESHOLD { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't print only up to the oldest LOGGING_DEPTH_THRESHOLD
blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be all blocks starting with ancestry.len() + 1 - LOGGING_DEPTH_THRESHOLD
node/core/approval-voting/src/lib.rs
Outdated
if bits.len() > LOGGING_DEPTH_THRESHOLD && i >= bits.len() - LOGGING_DEPTH_THRESHOLD { | ||
gum::trace!( | ||
target: LOG_TARGET, | ||
"Unapproved blocks on depth {}: {:?}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also add the relay chain block hash in the trace.
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
bot merge |
* Add counter for unapproved candidates * Update metrics * Split metrics * Remove depth metric * Print only the oldest unapproved candidates * Update logging condition * Fix logging condition * Update logging * Update node/core/approval-voting/src/lib.rs Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com> --------- Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Fixes #7124