-
Notifications
You must be signed in to change notification settings - Fork 741
Add SDK Sampling interface #1877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7cbd5bd
to
499a7f2
Compare
network/p2p/node_sampler.go
Outdated
type NodeSampler interface { | ||
// Sample returns at most [n] nodes. This may return fewer nodes if fewer | ||
// than [n] are available. | ||
Sample(n int) ([]ids.NodeID, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't Sample
take a context? You are already needing to stub out TODO getRecentValidatorIDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give an example of a sample that would return an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't Sample take a context? You are already needing to stub out TODO getRecentValidatorIDs.
Yeah this makes sense, I can add this to the signature
Can you give an example of a sample that would return an error?
Hmm I guess an example would be some grpc failure? Some implementation might hit the p-chain for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the only reason we have this Sample
interface is to use it for the Client
node selection implementation. We could actually swallow this error and just return an empty slice of node ids, and have the Client
report an error if it sees no nodes to send to, so the error is still visible and this interface signature is much cleaner since we can get rid of the error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could actually swallow this error and just return an empty slice of node ids, and have the Client report an error if it sees no nodes to send to,
If we error and return on calls to v.validators.GetCurrentHeight(ctx)
or v.validators.GetValidatorSet(ctx, height, v.subnetID)
we won't update v.validatorIDs
so future sampling won't return an empty slice of node IDs, it'll just return stale node IDs. FWIW the sampler implementations in the sampler
package have errors in their return signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a unit test for this specific scenario to make sure we return an empty slice instead of stale ids so the error is surfaced correctly in the Client
.
network/p2p/node_sampler_test.go
Outdated
} | ||
|
||
for disconnected := range tt.disconnected { | ||
require.NoError(sampler.Disconnected(context.Background(), disconnected)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use context.WithCancel for the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum can you elaborate? I didn't understand why this would be helpful here.
_ NodeSampler = (*Peers)(nil) | ||
) | ||
|
||
type NodeSampler interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this NodeSampler
or PeerSampler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems more like a node sampler to me. It samples nodes (which may or may not be connected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah my thinking was that the terminology we use are:
node
: a member in the networkpeer
: a node we are connected tovalidator
: a staking node
so Peers
is a component that represents nodes are are connected to, and Validators
is a component that represents staking nodes.
Co-authored-by: Stephen Buttolph <stephen@avalabs.org> Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Co-authored-by: Stephen Buttolph <stephen@avalabs.org> Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
Co-authored-by: Stephen Buttolph <stephen@avalabs.org> Signed-off-by: Joshua Kim <20001595+joshua-kim@users.noreply.github.com>
type NodeSampler interface { | ||
// Sample returns at most [limit] nodes. This may return fewer nodes if | ||
// fewer than [limit] are available. | ||
Sample(ctx context.Context, limit int) []ids.NodeID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: I also considered making another type to return here like:
type Node struct {
ID ids.NodeID
}
so we could add onto it without causing breaking changes. I felt like it was reasonable that in the future we might add some peer-level metadata to this, but it seemed premature to add for now.
network/p2p/validators.go
Outdated
if err != nil { | ||
return | ||
} | ||
validatorSet, err := v.validators.GetValidatorSet(ctx, height, v.subnetID) | ||
if err != nil { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm is there something we could do other than swallow these errors? Adding error
return values seems kind of annoying but also I dislike swallowing errors. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it does seem strange. Here's a link to the rationale behind this: #1877 (comment)
@@ -41,6 +41,7 @@ type Client struct { | |||
handlerPrefix []byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not part of this PR but should we have a NewClient
function? If this in intended to be used only inside this package, Client
can be unexported. If it's intended to be used outside this package, then a user can't use Client
because they can't set its inner fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is created via RegisterAppProtocol
. It's not possible to use a Client
without a corresponding Router
because the Router
holds the map of pending requests -> the callbacks that need to be fired on response. We could theoretically refactor this into a NewClient(Router, handlerID, ....)
if we felt that looked cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just feels weird to export this if it's actually unusable outside this package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Client
is usable outside of this package, RegisterAppProtocol
initializes it here.
Maybe we could rename this to NewClient
to be more verbose or remove the receiver on Router
. This definitely might seem confusing coming from the existing paradigm of how we've developed VMs without the SDK component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Yeah I think it might be a bit confusing to someone looking at this package how to create a new client. I think maybe even just a comment on Client
might suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I could see an argument to make the constructor code NewClient(sender common.AppSender, router *Router, handlerID byte) *Client
. This would have the added benefit of not having to keep around the sender
as a field in Router
which we currently do.
Signed-off-by: Stephen Buttolph <stephen@avalabs.org>
Why this should be merged
Allow the SDK clients to configure their node sampling strategy for
AppRequestAny
.How this works
Moves the peer state in
Router
(which was questionable anyways) to an implementation of an interface. We now expose two different implementations, one to sample any peer and one to sample strictly validators. The VM is expected to keep a reference to a set of samplers, and re-use them for any clients it might create.How this was tested
Added some unit tests.