Skip to content

Conversation

etan-status
Copy link
Contributor

We currently call onSlotEnd whenever all in-BN validator duties are completed. VC validator duties are not awaited. When onSlotEnd is processed close to the slot start, a VC may therefore miss duties. Adding a delay before onSlotEnd improves this situation. The logic can be optimized further if ActionTracker would track knownValidators from REST separately from in-process ones.

We currently call `onSlotEnd` whenever all in-BN validator duties are
completed. VC validator duties are not awaited. When `onSlotEnd` is
processed close to the slot start, a VC may therefore miss duties.
Adding a delay before `onSlotEnd` improves this situation.
The logic can be optimized further if `ActionTracker` would track
`knownValidators` from REST separately from in-process ones.
@tersec
Copy link
Contributor

tersec commented Jul 17, 2023

I'm not deeply opposed to changing the (unspecified meaning of) the on slot end messages, but it's not really per-se a problem/bug as such that it fires immediately on the end of BN duties. If the downstream code of that has problems, that's to some extent downstream code problems.

To the extent this PR practically removes latency spikes, sure, probably a net improvement, but the current slot end thing itself is merely a trigger.

The logic can be optimized further if ActionTracker would track knownValidators from REST separately from in-process ones.

Might not depend on ActionTracker -- there are other mechanisms already to check for in-BN validators, so REST validators == actiontracker validators without the in-BN ones.

@etan-status
Copy link
Contributor Author

If the downstream code of that has problems, that's to some extent downstream code problems.

Well, if a block is produced via REST, and during the await in the REST handler, the scheduler happens to decide to process the onSlotEnd, the block production can time out. Not really a VC issue (if that's what's meant with 'downstream code'), it sent the block production request right on slot start.

Might not depend on ActionTracker -- there are other mechanisms already to check for in-BN validators, so REST validators == actiontracker validators without the in-BN ones.

Yes, the set of validators could be determined like that, but it's still tricky to determine whether any of them have duties for the wall slot. We could use "actiontracker minus in-BN ones" to check whether there are any VC validators. Not sure if that's good though, as it retains an instance of BN-validators being treated separately from VC-validators.

On the flipside, we could also simply always wait, regardless of validators being attached or not. In the end, it doesn't seem to make too much sense conceptually to run the GC and pruning right at the start of a slot, while the next block is expected to be received any time soon.

@tersec
Copy link
Contributor

tersec commented Jul 17, 2023

Yeah, agree, the GC and pruning specifically should be tied to slot timing, not in-BN validators.

So, I'd suggest that's maybe another principled approach, to split out the on slot end processing from actual-within-slot-duties from the end-of-slot-Nim-GC-maintenance things which don't depend on duties (BN, VC, or otherwise).

@etan-status
Copy link
Contributor Author

Are there parts in onSlotEnd that are duty specific? To me, they all appear more as preparation work for the subsequent slot. And in the worst case, they all have to be fast enough to be done after completing all the duties (including initial propagation of aggregates, which is currently also not awaited for in-BN validators), so restricting the onSlotEnd tasks to the final half slot interval seems alright.

Probably something for @arnetheduck to comment on :)

@github-actions
Copy link

github-actions bot commented Jul 17, 2023

Unit Test Results

         9 files    1 074 suites   37m 57s ⏱️
  3 712 tests   3 433 ✔️ 279 💤 0
15 832 runs  15 527 ✔️ 305 💤 0

Results for commit b720c7d.

♻️ This comment has been updated with latest results.

@etan-status etan-status merged commit dd35d2d into unstable Jul 18, 2023
@etan-status etan-status deleted the dev/etan/vd-delayslotend branch July 18, 2023 18:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants