Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

check OS network limits when starting validator #20874

Merged
merged 4 commits into from
Oct 26, 2021

Conversation

jbiseda
Copy link
Contributor

@jbiseda jbiseda commented Oct 22, 2021

Problem

solana-sys-tuner can be used to increase linux network buffer sizes, but we do not check that these values have been applied.

Summary of Changes

Check that linux network buffer sizes have been increased in validator start. Checks can be bypassed with --no-os-network-limits-test.

Fixes #

let err_msg = "OS network limit test failed. If you wish to continue, try --no-os-network-limits-test";
error!("{}", err_msg);
eprintln!("{}", err_msg);
exit(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried this is too aggressive. How about emitting a datapoint instead of failing here, so we can see how many validators get snagged here first

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. If the metric is out of control we can remediate it down to a more comfortable level before adding the boot

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to get an opinion on this. Changed to log a datapoint.

let err_msg = "OS network limit test failed. If you wish to continue, try --no-os-network-limits-test";
error!("{}", err_msg);
eprintln!("{}", err_msg);
exit(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. If the metric is out of control we can remediate it down to a more comfortable level before adding the boot

validator/src/main.rs Show resolved Hide resolved
@jbiseda jbiseda marked this pull request as ready for review October 23, 2021 05:12
@jbiseda jbiseda requested review from t-nelson and mvines October 23, 2021 05:12
mvines
mvines previously approved these changes Oct 23, 2021
@codecov
Copy link

codecov bot commented Oct 23, 2021

Codecov Report

Merging #20874 (cb7ac0d) into master (e03dc9e) will increase coverage by 0.1%.
The diff coverage is 92.6%.

@@            Coverage Diff            @@
##           master   #20874     +/-   ##
=========================================
+ Coverage    81.4%    81.5%   +0.1%     
=========================================
  Files         531      496     -35     
  Lines      141636   139627   -2009     
  Branches      295        0    -295     
=========================================
- Hits       115312   113928   -1384     
+ Misses      26224    25699    -525     
+ Partials      100        0    -100     

@mergify mergify bot dismissed mvines’s stale review October 25, 2021 21:05

Pull request has been modified.

@jbiseda jbiseda merged commit 6470560 into solana-labs:master Oct 26, 2021
@jbiseda jbiseda deleted the netbuf-check branch October 26, 2021 01:26
dankelleher pushed a commit to identity-com/solana that referenced this pull request Nov 24, 2021
frits-metalogix added a commit to identity-com/solana that referenced this pull request Nov 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants