Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

AndreiEres
Copy link
Contributor

Fixes #7124

@AndreiEres AndreiEres added A3-in_progress Pull request is in progress. No review needed at this stage. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Jul 12, 2023
Copy link
Contributor

@sandreim sandreim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could also gum::trace a bounded number of candidate hashes that are unapproved but I'd do it only for depths > 10 in order to not spam the logs needlessly - see #7124 (comment)

@@ -160,6 +160,7 @@ struct MetricsInner {
time_db_transaction: prometheus::Histogram,
time_recover_and_approve: prometheus::Histogram,
candidate_signatures_requests_total: prometheus::Counter<prometheus::U64>,
unapproved_candidates_in_unfinalized_chain: prometheus::Counter<prometheus::U64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a GaugeVec. Keep in mind that approval checking might take between 1-2 relay chain blocks so this would just rise up each call via finality_target_with_longest_chain by GRANDPA making the metric useless for differentiating between normal operation and failure.

I suggest to have 2 labels for it:

  • count to capture situations when approval voting continuously fails to approve certain number of candidates resulting in this going up
  • depth to capture when just 1 single candidate was not approved while all rest are approved leading to a continuous increase of depth

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for depth, there already is polkadot_parachain_approval_checking_finality_lag

@AndreiEres AndreiEres added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Jul 20, 2023
@AndreiEres AndreiEres changed the title [WIP] Add counter for unapproved candidates Add counter for unapproved candidates Jul 20, 2023
@AndreiEres AndreiEres requested a review from sandreim July 21, 2023 13:14
@AndreiEres AndreiEres force-pushed the AndreiEres-7124-add-metric-to-count-unapproved-parachain-blocks-contained-in-unfinalized-chain branch from b8f864c to 74d2fd0 Compare July 25, 2023 10:20
"polkadot_parachain_approval_unapproved_candidates_in_unfinalized_chain",
"Number of unapproved candidates in unfinalized chain",
),
&["count", "depth"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is not what I meant. There should be one label, and it's value can be count or depth. I think it's better to have 2 separate Gauge in this case, one measuring depth of first unapproved and the total count of unapproved candidates.

@@ -1534,6 +1565,16 @@ async fn handle_approved_ancestor<Context>(
unapproved.len(),
entry.candidates().len(),
);
if bits.len() > LOGGING_DEPTH_THRESHOLD {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this would be excessive output. What's holding finality is basically the oldest unapproved candidates. Let's just print a the candidate hashes of the in the oldest unapproved block only.

@@ -1534,6 +1553,15 @@ async fn handle_approved_ancestor<Context>(
unapproved.len(),
entry.candidates().len(),
);
if i == oldest_ancestor_idx && bits.len() > LOGGING_DEPTH_THRESHOLD {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't print only up to the oldest LOGGING_DEPTH_THRESHOLD blocks.

Copy link
Contributor

@sandreim sandreim Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be all blocks starting with ancestry.len() + 1 - LOGGING_DEPTH_THRESHOLD

if bits.len() > LOGGING_DEPTH_THRESHOLD && i >= bits.len() - LOGGING_DEPTH_THRESHOLD {
gum::trace!(
target: LOG_TARGET,
"Unapproved blocks on depth {}: {:?}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also add the relay chain block hash in the trace.

node/core/approval-voting/src/lib.rs Outdated Show resolved Hide resolved
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
@vstakhov
Copy link
Contributor

bot merge

@paritytech-processbot paritytech-processbot bot merged commit c2efce0 into master Aug 10, 2023
@paritytech-processbot paritytech-processbot bot deleted the AndreiEres-7124-add-metric-to-count-unapproved-parachain-blocks-contained-in-unfinalized-chain branch August 10, 2023 11:17
s0me0ne-unkn0wn pushed a commit that referenced this pull request Aug 15, 2023
* Add counter for unapproved candidates

* Update metrics

* Split metrics

* Remove depth metric

* Print only the oldest unapproved candidates

* Update logging condition

* Fix logging condition

* Update logging

* Update node/core/approval-voting/src/lib.rs

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

---------

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add metric to count unapproved parachain blocks contained in unfinalized chain.
4 participants