scheduler forward packets #898
Conversation
```diff
@@ -130,6 +134,10 @@ impl ForwardPacketBatchesByAccounts {
     pub fn iter_batches(&self) -> impl Iterator<Item = &ForwardBatch> {
         self.forward_batches.iter()
     }
+
+    pub fn take_iter_batches(self) -> impl Iterator<Item = ForwardBatch> {
```
I send the packets from these `ForwardBatch`es, created by the scheduler, to workers. Instead of cloning the packets, just take ownership.
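For context, a minimal sketch of why taking ownership avoids the clone (simplified stand-in types, not the actual agave definitions): `iter_batches` only hands out references, while `take_iter_batches` consumes the container and moves each `ForwardBatch` out, so the packets can be handed to workers as-is.

```rust
// Simplified sketch: a container offering both a borrowing and a consuming iterator.
struct ForwardBatch {
    packets: Vec<Vec<u8>>, // stand-in for the real packet type
}

struct ForwardPacketBatchesByAccounts {
    forward_batches: Vec<ForwardBatch>,
}

impl ForwardPacketBatchesByAccounts {
    /// Borrowing iterator: callers only inspect the batches.
    fn iter_batches(&self) -> impl Iterator<Item = &ForwardBatch> {
        self.forward_batches.iter()
    }

    /// Consuming iterator: callers take ownership of each batch,
    /// so the packets can be sent without cloning them.
    fn take_iter_batches(self) -> impl Iterator<Item = ForwardBatch> {
        self.forward_batches.into_iter()
    }
}

fn main() {
    let container = ForwardPacketBatchesByAccounts {
        forward_batches: vec![ForwardBatch { packets: vec![vec![0u8; 8]] }],
    };
    let total = container.iter_batches().count(); // borrow ends here
    for batch in container.take_iter_batches() {
        // stand-in for "send this batch to a worker"
        println!("sending batch with {} packets ({} total batches)", batch.packets.len(), total);
    }
}
```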
Force-pushed from cc74da9 to 512de3e, then from 512de3e to 225da20.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

```
@@            Coverage Diff            @@
##           master     #898     +/-   ##
=========================================
- Coverage    81.9%    81.8%    -0.1%
=========================================
  Files         853      853
  Lines      231779   231938    +159
=========================================
+ Hits       189829   189914     +85
- Misses      41950    42024     +74
```
Local cluster summary looks fine.
```rust
        should_forward: bool,
    },
    /// Only used during transition.
    Transitioning,
```
A necessary hack for `take`; only wish it were entirely internal.
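A minimal sketch of the placeholder-variant pattern this comment refers to (the enum and field names are illustrative, not the actual `transaction_state` types): with only a `&mut` to the enum you cannot move a field out of the current variant, so the code temporarily swaps in `Transitioning`, takes ownership of the old value, and then writes the next state back.

```rust
use std::mem;

enum TransactionState {
    Unprocessed { packet: Vec<u8>, should_forward: bool },
    Pending,
    /// Only used during transition.
    Transitioning,
}

impl TransactionState {
    /// Moves the packet out of an `Unprocessed` state, leaving `Pending` behind.
    fn take_packet(&mut self) -> Option<Vec<u8>> {
        // Swap in the placeholder so we own the previous state by value.
        match mem::replace(self, TransactionState::Transitioning) {
            TransactionState::Unprocessed { packet, .. } => {
                *self = TransactionState::Pending;
                Some(packet)
            }
            other => {
                // Not the state we expected: restore it unchanged.
                *self = other;
                None
            }
        }
    }
}

fn main() {
    let mut state = TransactionState::Unprocessed {
        packet: vec![1, 2, 3],
        should_forward: true,
    };
    assert_eq!(state.take_packet(), Some(vec![1, 2, 3]));
    assert!(state.take_packet().is_none()); // now `Pending`, nothing left to take
}
```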
```rust
        // If we hit the time limit. Drop everything that was not checked/processed.
        // If we cannot run these simple checks in time, then we cannot run them during
        // leader slot.
        if max_time_reached {
```
how's that 100 ms determined?
Just did 1/4 of a block time; the checks done here should be really quick, and if we cannot do them within 100ms we won't get to these packets in the scheduling loop, which (currently) does more complex checks than these.
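A hedged sketch of that arithmetic (constant names here are illustrative, not necessarily the ones used in the PR), assuming the usual ~400ms target slot time:

```rust
use std::time::Duration;

/// Target slot (block) time is roughly 400ms.
const SLOT_DURATION_MS: u64 = 400;

/// Budget the cheap forwarding checks at a quarter of a slot (100ms); if they
/// cannot finish within this, the scheduling loop would not get to these
/// packets anyway.
const MAX_FORWARDING_DURATION: Duration = Duration::from_millis(SLOT_DURATION_MS / 4);

fn main() {
    println!("forwarding budget: {:?}", MAX_FORWARDING_DURATION);
}
```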
The logic of "we will do up to 100ms of forwarding work, and if there are still packets in the container after that, they will be dropped" sounds a bit arbitrary. When I was benchmarking it, I had to change `MAX_FORWARDING_DURATION` to `MAX`. What do you think about stopping the forwarding work when the CU limit is hit (it already does that somewhat, just not perfectly)?
In any case, for this PR we worry more about sending too many packets to the next leader than about not sending enough, so I think it is OK for now. Maybe keep an eye on the number of forwarded packets vs. dropped packets.
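A hedged sketch of the alternative suggested above (all types and names are illustrative): stop filling forward batches once either the elapsed-time budget or the cost-unit budget runs out, rather than relying on the time limit alone.

```rust
use std::time::{Duration, Instant};

struct Packet {
    cost_units: u64, // stand-in for the compute-unit cost of the transaction
}

fn fill_forward_batches(
    container: &mut Vec<Packet>,
    max_duration: Duration,
    max_cost_units: u64,
) -> Vec<Packet> {
    let start = Instant::now();
    let mut used_cost_units = 0u64;
    let mut forwarded = Vec::new();

    while start.elapsed() < max_duration {
        let Some(packet) = container.pop() else { break };
        if used_cost_units.saturating_add(packet.cost_units) > max_cost_units {
            // CU budget exhausted: this packet and whatever is still in the
            // container get dropped, mirroring the time-limit behaviour.
            break;
        }
        used_cost_units += packet.cost_units;
        forwarded.push(packet);
    }
    forwarded
}

fn main() {
    let mut container: Vec<Packet> = (0..10).map(|_| Packet { cost_units: 5_000 }).collect();
    let batch = fill_forward_batches(&mut container, Duration::from_millis(100), 20_000);
    println!("forwarded {} packets, dropped {}", batch.len(), container.len());
}
```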
lgtm - offline benchmarking of the new forwarding function is in line with the original scheduler's forwarder. It adds the needed functionality to the new scheduler.
I made a note on a small implementation detail, which is the part I am looking to optimize for both the original scheduler and the new one. We can merge this PR, then refactor the Forwarder afterwards.
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.
(cherry picked from commit fb35f19)

```
# Conflicts:
#	core/src/banking_stage/transaction_scheduler/prio_graph_scheduler.rs
#	core/src/banking_stage/transaction_scheduler/scheduler_controller.rs
#	core/src/banking_stage/transaction_scheduler/scheduler_metrics.rs
#	core/src/banking_stage/transaction_scheduler/transaction_state.rs
#	core/src/banking_stage/transaction_scheduler/transaction_state_container.rs
#	core/src/validator.rs
```
Problem
Summary of Changes
Fixes #