Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

allow staked nodes weight override #26870

Merged

Conversation

janlegner
Copy link
Contributor

Problem

As per issue #26407: The validator may want to prioritize QUIC traffic from a known whitelist of sources and apply its own stake to those transactions.

Summary of Changes

  • New parameter is added to the solana-validator program: --staked_nodes_overrides <PATH/URL>
  • A configuration file is periodically fetched from the given location (filesystem or HTTP)
  • The fetched configuration is applied to the staked nodes weight - effectively allowing whitelisting/prioritizing traffic from specified sources.

Config example

auto_reload: true
stake_map:
  1.1.1.1: 1000000000000000
  2.2.2.2: 4000000000000000

@mergify mergify bot added the community Community contribution label Aug 1, 2022
@mergify mergify bot requested a review from a team August 1, 2022 15:35
@janlegner janlegner changed the title Allowed staked nodes weight override allow staked nodes weight override Aug 1, 2022
@janlegner janlegner force-pushed the feat-staked-nodes-weight-override branch from fecbbc0 to 5a585f5 Compare August 1, 2022 17:40
@pgarg66
Copy link
Contributor

pgarg66 commented Aug 1, 2022

It might be more useful to have overrides for pubkey -> stake mapping (instead of IP address -> stake). Most known connections will be from peer nodes that have active stake (e.g. previous leaders, or soon to be leaders). If the peer is staked, it's identity pubkey is being used to lookup its stake.

@janlegner
Copy link
Contributor Author

janlegner commented Aug 2, 2022

It might be more useful to have overrides for pubkey -> stake mapping (instead of IP address -> stake). Most known connections will be from peer nodes that have active stake (e.g. previous leaders, or soon to be leaders). If the peer is staked, it's identity pubkey is being used to lookup its stake.

That is a good idea - but what about e.g. RPC nodes without vote pubkey? 🤔 Maybe allowing both IP and pubkey would be a good idea?

EDIT: or identity pubkey?

@mvines
Copy link
Contributor

mvines commented Aug 2, 2022

I don't like the idea of a new service within the validator that tries to keep this list updated. Instead we only need a couple primitives:

  1. New solana-validator command-line argument to provide the data file
  2. A new admin ipc method to signal a reload of the data file (cc:
    pub trait AdminRpc {
    )

@janlegner
Copy link
Contributor Author

I don't like the idea of a new service within the validator that tries to keep this list updated. Instead we only need a couple primitives:

  1. New solana-validator command-line argument to provide the data file
  2. A new admin ipc method to signal a reload of the data file (cc:
    pub trait AdminRpc {

    )

Thanks for the suggestion, I did not think about using admin signals but that makes a lot of sense. Where do you think we should then move the logic loading the configuration from file/URL? Keeping it in core/src/... seems strange. I do not want to inflate validator/src/main.rs too much, so maybe validator/src/lib.rs?

@mvines
Copy link
Contributor

mvines commented Aug 3, 2022

I do not want to inflate validator/src/main.rs too much, so maybe validator/src/lib.rs?

Yep that seems great. Putting it in lib.rs will also make it easier to add to solana-test-validator if somebody wants the functionality added there as well one day

@ochaloup ochaloup force-pushed the feat-staked-nodes-weight-override branch 5 times, most recently from 021ec77 to ab651bc Compare August 9, 2022 15:02
@ochaloup
Copy link
Contributor

ochaloup commented Aug 9, 2022

I did an update behalf of @janlegner on this PR. The yaml configuration is loaded via RPC admin command or as a configuration option at the start of the validator program. The configuration works with the node identities now (/cc @pgarg66) instead of the IPs.

@mvines would you be so kind and provide me a feedback on this PR?

@ochaloup ochaloup force-pushed the feat-staked-nodes-weight-override branch from ab651bc to 5010b79 Compare August 10, 2022 06:35
Copy link
Contributor

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice progress!

Comment on lines 462 to 464
if let Ok(url) = Url::parse(path_or_url) {
return try_config_from_url(url);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference is to remove the URL support entirely. Local files only. It's trivial for the caller to wget the file, or use whatever other standard url fetching tooling they prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I changed for using only the path as you recommended.

.long("staked-nodes-overrides")
.value_name("FILE_OR_URL")
.takes_value(true)
.help("Provide file with custom overrides for stakes of specific IPs."), // TODO: change IP to IP or identity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to include some more info in the .help() about what impact this overrides may have for the validator operationally, either as an inline or a call out to external documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the more instructive description. I will be glad if you can comment on the way I did.

I searched for the documentation page that the description could be elaborated (not sure if it's the way to go now) but I haven't found any CLI reference guide where I could add that to. The solana cli has got the reference page that seems to be generated from the code (https://docs.solana.com/cli/usage#solana-cli). The solana-validator has got no reference guide. Should that added to some specific guide with even more details?

@@ -21,18 +22,40 @@ pub struct StakedNodesUpdaterService {
thread_hdl: JoinHandle<()>,
}

#[derive(Default, Deserialize, Clone)]
pub struct StakedNodesOverrides {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love how the placement of this struct here causes a number of extra downstream use of the staked_nodes_updater_service module, which really feels like it ought to be private to solana-core. Likewise, it necessitates a new serde_yaml dep for core.

Any way we can move the serde_yaml dep fully into validator/ instead?

Also are these files going to be large? It's been my experience that serde_yaml is very slow on large files (relative to serde_json)

Copy link
Contributor

@ochaloup ochaloup Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The declaration was used to easily load various data from the struct. As it's the validator module that needs to provide the data to the core service I haven't thought an easy way here.

The struct could be defined without the serde_yaml dependency but then it still means multiple extra downstream use.
I haven't thought anything super clever and I eventually pass the HashMap as the shared structure for the data passing. I don't like the fact that any change of the format or adding types for overrides (e.g. the discussed IP) means a bit more changes.

Even though would be fine to do it the way I did?

About the serde_yaml. We like the yaml as it's easy to work with and the configuration file is not expected to be large. So I left there the loading from yaml for now. If you consider it should really be the json, I can change it.

@ochaloup ochaloup force-pushed the feat-staked-nodes-weight-override branch from 5010b79 to d513e25 Compare August 11, 2022 12:44
@ochaloup
Copy link
Contributor

@mvines thank you for the review. I hope I addressed most of the points (details in the comments above).

@ochaloup
Copy link
Contributor

ochaloup commented Aug 11, 2022

@mvines @pgarg66 As browsing through the code I would like to check with you one detail.

The goal of this PR is to override the stake amounts (as per pr#26407) to make a validator peer to be considered with some amount of the stake and then from that calculate QUIC streams and packet vote stage.
I did it now in way that the mapping is taken from the (https://github.com/solana-labs/solana/blob/v1.11.5/core/src/staked_nodes_updater_service.rs#L75) tvu_peers() and overrde is applied only if the validator identity is valid within the TVU ips. I think that this could be limiting. When the validator happens to be a valid TVU peer? For the QUIC stake overriding purpose of ip/id could I use rather the cluster_info.all_peers()?

Btw. could be two validators running with the same IP? Could be then the ip_to_stake map in clash in values (https://github.com/solana-labs/solana/blob/v1.11.5/core/src/staked_nodes_updater_service.rs#L82)?

@pgarg66
Copy link
Contributor

pgarg66 commented Aug 11, 2022

@mvines @pgarg66 As browsing through the code I would like to check with you one detail.

The goal of this PR is to override the stake amounts (as per pr#26407) to make a validator peer to be considered with some amount of the stake and then from that calculate QUIC streams and packet vote stage. I did it now in way that the mapping is taken from the (https://github.com/solana-labs/solana/blob/v1.11.5/core/src/staked_nodes_updater_service.rs#L75) tvu_peers() and overrde is applied only if the validator identity is valid within the TVU ips. I think that this could be limiting. When the validator happens to be a valid TVU peer? For the QUIC stake overriding purpose of ip/id could I use rather the cluster_info.all_peers()?

It makes sense to use all_peers(). It'll allow a validator to allocate more QUIC streams to a node that's not a staked validator (e.g. an RPC node, or a client).

Btw. could be two validators running with the same IP? Could be then the ip_to_stake map in clash in values (https://github.com/solana-labs/solana/blob/v1.11.5/core/src/staked_nodes_updater_service.rs#L82)?

It's technically possible for two validators, e.g. behind NAT, using the same IP address. For this reason, we are not using ip_stake_map in QUIC. We have a mechanism to get peer's identity pubkey during QUIC connection setup process, and that's used for looking up id_to_stake (pubkey_stake_map) table.

@ochaloup
Copy link
Contributor

ochaloup commented Aug 11, 2022

It makes sense to use all_peers(). It'll allow a validator to allocate more QUIC streams to a node that's not a staked validator (e.g. an RPC node, or a client).

I see. Thank you @pgarg66 for this great elaboration! I changed the code for all_peers() as suggested (if you can check it would be nice).

@ochaloup ochaloup force-pushed the feat-staked-nodes-weight-override branch from d513e25 to 51d72de Compare August 11, 2022 16:58
@pgarg66
Copy link
Contributor

pgarg66 commented Aug 11, 2022

It makes sense to use all_peers(). It'll allow a validator to allocate more QUIC streams to a node that's not a staked validator (e.g. an RPC node, or a client).

I see. Thank you @pgarg66 for this great elaboration! I changed the code for all_peers() as suggested (if you can check it would be nice).

Thanks for updating the PR. Looks good to me. Let's wait for @mvines to review it too, before we merge it.

Copy link
Contributor

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@pgarg66 pgarg66 merged commit fc6cee9 into solana-labs:master Aug 11, 2022
xiangzhu70 pushed a commit to xiangzhu70/solana that referenced this pull request Aug 17, 2022
* Allowed staked nodes weight override (solana-labs#26407)

* Allowed staked nodes weight override, passing only HashMap over to core module

Co-authored-by: Ondra Chaloupka <chalda@chainkeepers.io>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
community Community contribution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants