Excessive resource utilization when ethstats reporting is enabled #3896
Comments
In the ethstats-enabled scenario, validators struggle to parse/produce blocks on time, which results in many benign AuRa reports. |
I forgot to mention the VM orientation shown in my screenshots:
|
@nicexe Did you try that with v1.12.3? |
Hi @varasev thanks for the reply.
I'll keep all nodes on the latest version except one, which I'll configure to use 1.12.3 and sync from 0. |
@varasev 1.12.3 definitely helped. I restored the archive node and validators 1 to 4 and configured them to use 1.12.6. I removed all data for validator 5 and configured it to use 1.12.3. After validator 5 synced with the rest of the nodes, I let it rest for a few minutes and then checked the system load, as seen in the screenshot. Validator 5 has a system load average of about 1 on a 4-core VM, while the other nodes each have a system load average of about 8-10 on a 4-core VM. Again, the only outlier is validator 1, which has significantly lower CPU utilization and a system load average of about 2 on a 4-core VM, but it is also the only node reporting invalid stats to the ethstats server. |
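For comparison, load averages are easier to read when divided by the number of cores: as a rough rule, a value well above 1.0 per core suggests more runnable work than the VM can serve. The following is only a minimal sketch applied to the figures quoted above; the helper name `load_per_core` is hypothetical and not part of Nethermind or the test setup.

```shell
#!/bin/sh
# load_per_core LOAD CORES: print LOAD divided by CORES, to two decimals.
# Hypothetical helper for illustration only.
load_per_core() {
    LC_ALL=C awk -v load="$1" -v cores="$2" \
        'BEGIN { printf "%.2f\n", load / cores }'
}

# Load averages reported in this thread, all on 4-core VMs.
load_per_core 1 4     # validator 5 on 1.12.3          -> 0.25
load_per_core 2 4     # validator 1 (outlier)          -> 0.50
load_per_core 8 4     # other nodes on 1.12.6, low end -> 2.00
load_per_core 10 4    # other nodes on 1.12.6, high end -> 2.50
```

On a Linux node, the same helper could be fed live values, for example the first field of `/proc/loadavg` and the output of `nproc`.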
Also pushed #3906 |
I tried to use |
@LukaszRozmej Can you please make a docker image for this? |
@nicexe I have not been able to reproduce this. Would you be able to provide a performance trace from one of the misbehaving nodes? I am asking our devops how to do that on nodes running inside Docker, as that is more complicated. @matilote would you be able to provide a Docker image and instructions on how to use dotTrace here? |
@nicexe can you help us gather data here?
|
@nicexe we found one potential issue with our ethstats integration. It could be connected with your problem. We will release the fix soon. The fix is merged to master |
Describe the bug
Nethermind is using excessive resources
To Reproduce
Steps to reproduce the behavior:
./run_all.sh ), segregate the nodes into a different server for each one.
docker-compose.yml and update the NETHERMIND_NETWORKCONFIG_EXTERNALIP and NETHERMIND_ETHSTATSCONFIG_SERVER environment variables
run_all.sh and stop_all.sh to only run/stop the specific node or ethstats for each server
cd .. # go back to the `compose` directory
and docker-compose start && docker-compose logs -f
docker-compose.yml and update NETHERMIND_ETHSTATSCONFIG_ENABLED to false
Screenshots
Here you can see some system metrics with ethstats reporting enabled
Focus on system load average
The only outlier is validator1 which seems to report the wrong information yet has the least resource utilization
Here you can see some system metrics with ethstats reporting disabled
System Information:
Additional notes
I cannot really explain the discrepancy between validator1 and the other validators in the ethstats enabled scenario.
All validators have almost identical environment variables set. They differ only in node name, keys, and external IP.
Each node lives on its own VM, each on a different host machine, with some host machines in different datacenters.
All VMs have identical specs.
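To switch a node between the two scenarios from the reproduction steps, the `NETHERMIND_ETHSTATSCONFIG_ENABLED` value in `docker-compose.yml` has to be flipped on each server. The sketch below shows one way this might be scripted. It assumes the variable is written in the `NAME=value` list form under `environment:`, which may not match the actual compose files; the function name `set_ethstats_enabled` and the demo service layout are hypothetical.

```shell
#!/bin/sh
# set_ethstats_enabled FILE VALUE: rewrite the ethstats flag in FILE.
# Assumes a line of the form NETHERMIND_ETHSTATSCONFIG_ENABLED=<word>.
# Hypothetical helper for illustration only.
set_ethstats_enabled() {
    ese_file="$1"
    ese_value="$2"
    ese_tmp="${ese_file}.tmp"
    # Replace only the word after '=' so surrounding YAML quoting is kept.
    sed "s/\(NETHERMIND_ETHSTATSCONFIG_ENABLED=\)[A-Za-z]*/\1${ese_value}/" \
        "$ese_file" > "$ese_tmp" && mv "$ese_tmp" "$ese_file"
}

# Demo on a throwaway file; the service name and layout are made up.
demo=$(mktemp)
cat > "$demo" <<'EOF'
services:
  validator1:
    environment:
      - NETHERMIND_ETHSTATSCONFIG_ENABLED=true
EOF

set_ethstats_enabled "$demo" false
grep ETHSTATSCONFIG_ENABLED "$demo"
rm -f "$demo"
```

Containers generally need to be recreated (for example with `docker-compose up -d`) before a changed environment value takes effect, since starting an existing container reuses its original environment.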