-
Notifications
You must be signed in to change notification settings - Fork 619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: epoch analysis tool #11343
feat: epoch analysis tool #11343
Conversation
tools/state-viewer/src/cli.rs
Outdated
pub struct EpochAnalysisCmd { | ||
/// The height of the epoch to analyze. | ||
#[clap(long)] | ||
height: EpochHeight, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be the start height of the epoch or any height in that epoch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Start height of the epoch.
Thanks for sharing, I was indeed curious about this part (how to iterate over the epoch-related info) |
805ea61
to
b254d22
Compare
77b06fb
to
d571fc1
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #11343 +/- ##
==========================================
- Coverage 71.37% 71.28% -0.10%
==========================================
Files 787 787
Lines 159091 159299 +208
Branches 159091 159299 +208
==========================================
+ Hits 113554 113556 +2
- Misses 40624 40828 +204
- Partials 4913 4915 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I I want to understand the early return in get_initial_chunk_producer_assignment
before approving - see the comment please.
chain/epoch-manager/src/proposals.rs
Outdated
@@ -51,12 +51,12 @@ pub fn proposals_to_epoch_info( | |||
) -> Result<EpochInfo, EpochError> { | |||
// For this protocol feature, switch happened two epochs after protocol upgrade. | |||
// Keeping it this way for replayability. | |||
if checked_feature!( | |||
return if checked_feature!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can drop the return with the ;
at the end also.
chunk_producers.push(index); | ||
if used_chunk_producers.contains(&index) { | ||
return vec![vec![]; num_shards as usize]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite confusing. Why do you return empty vectors if some index is repeated? If this was intended can you add a comment please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it was a leftover of implementation with some bug, removed.
tools/state-viewer/src/commands.rs
Outdated
let epoch_heights_to_validator_infos = | ||
BTreeMap::from_iter(epoch_heights_to_ids.iter().filter_map(|(epoch_height, epoch_id)| { | ||
if *epoch_height >= min_epoch_height | ||
&& *epoch_height <= max_epoch_height.saturating_sub(4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I can actually put 2 there, made it more clear.
tools/state-viewer/src/commands.rs
Outdated
let next_next_epoch_id = epoch_heights_to_ids.get(&next_next_epoch_height).unwrap(); | ||
let epoch_summary = epoch_heights_to_validator_infos.get(epoch_height).unwrap(); | ||
|
||
next_epoch_info = match mode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fun way to implement where you have multiple of those checks :) I'm totally fine with it but I'm curious how did you arrive at this and how does it compare to having two methods, one for each mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was just bad code, how about now?
These two modes really share a lot of code. The difference is - for the first one, I always take data from DB. For the second, I override some data with previously generated info to simulate new algorithm behaviour. But the rest stays the same, so I don't want to separate impls completely.
daf6f2f
to
9e8ea27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Tool to analyse epoch infos in two modes:
check-consistency
- regenerate next next epoch info based on two previous epochs and check that it matches the epoch info stored in DB;backtest
- regenerate epoch infos with existing proposals, rewards and kickouts as ifPROTOCOL_VERSION
was always in place.The
backtest
was used to estimate new algorithm for chunk producer shard assignments and showed that on average there is only one state sync happening, if we use start epoch height >= 545. Epoch info for 544 can't be retrieved for some reason, see #11477.Consistency check revealed that some epochs in the past can't be replayed, see #11476.
Expected output of it, note that epoch T generates epoch T+2: