-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add threshold flexbility to start a level #34
Comments
Some comments (after new year's eve so bear with me :D ):
|
The idea is: |
Ok - it goes back to the "threshold per level" idea where we could pass up quickly to another level if we have "enough" signatures. type PostProcessor interface {
OnVerifiedSignature(sp *sigPair)
} interface. So a |
Not exactly, because we're trying to use a technical (a node that hasn't communicated since 1 second is likely dead) vs. a functional heuristic (on average we will have 80% of the nodes so let's not wait longer if we have enough sigs). The first could be use even if you would ideally want to have all sigs at all levels. This said we don't have to implement this now , especially ( :-) ) if we don't think it makes a difference. I will try in the simulator and report back. |
It's possible to simplify it with:
In practice, in the simulator it makes no difference for most scenarios.
WITH this new feature (all sigs) WITHOUT WITH (80% of the sigs) With different settings: WITHOUT WITH .80%: |
In other words, it's interesting. Need to think about it more. |
The 80% thing makes a huge difference, it's surprising... do you have "failed" nodes in your setup ?? Otherwise it's weird that it makes such a difference, since all signatures are supposed to be complete for a given level no ? On your first previous point, "if sig is complete for a level, then start the next one": @bkolad saw last week that handel took more time with |
Of course, if not this setting is more or less useless.
What I measured in the simulator is a little bit different (but there is an overlap between the two): |
That's more of less the current logic: very aggressive timeouts. You can see in the results above that the improvement is much more interesting with a timeout of 200ms than a timeout of 100ms. |
Ok good, a more static version of this is implemented right now in |
As discussed, let's try with a timeout of 50ms short term. |
Today we start a level only when it has all the signatures.
We could have something smarter when the missing signature comes from a node that should have communicated long ago.
Technically, we could have a module to identify suspicious nodes. If the missing sigs comes from a suspicious node we start the level. A node would become suspicious if it hasn't sent its signature after a given delay or of it hasn't responded when we use tcp or quic to communicate.
The text was updated successfully, but these errors were encountered: