downloader: Lock-up during sync due to circular return logic #16539
Description
EDIT: Original title: Lock-up during initial sync
EDIT: See this comment for current best guess on cause.
System information
Geth version: v1.8.4-stable-2423ae01/linux-amd64/go1.10
(installed via Ubuntu PPA package)
OS & Version: Ubuntu 16.04.4 LTS (Xenial Xerus)
Machine: KVM VPS
```
% uname -a
Linux <hostname> 4.4.0-119-generic #143-Ubuntu SMP Mon Apr 2 16:08:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
```
Expected behaviour
Continuous fast-sync.
Running via systemd with:

```
/usr/bin/geth --pprof --metrics --datadir /home/geth/.ethereum --cache 4096 --txpool.pricelimit 31337000 --syncmode fast --ethstats "veox-geth-lightserv-new-RESYNC:$SECRET@ethstats.net"
```
Actual behaviour
After seemingly-normal operation, and dropping off "stalling" peers once in a while, non-debug log output stops, as shown in this log tail.
At this point, in a console (using `geth attach`):

```
> eth.syncing
{
  currentBlock: 4460991,
  highestBlock: 5402762,
  knownStates: 15834634,
  pulledStates: 15823975,
  startingBlock: 4327715
}
> admin.peers.length
25
```
Setting

```
> debug.vmodule("p2p=4,downloader=4")
```

results in

```
DEBUG[04-20|15:06:23] Recalculated downloader QoS values rtt=5.195478857s confidence=1.000 ttl=15.586452156s
```

being printed repeatedly (the timestamp changes, as expected; the `rtt`/`ttl` values don't).
Forcibly disconnecting a peer with `admin.removePeer("<enode>")` works; a new peer is then selected from the pool. In other words, p2p still works fine(-ish?).
Steps to reproduce the behaviour
Not sure; this is possibly related to networking conditions on the machine.
Happens anywhere between 5 minutes and 1 hour after starting the node.
Rambling
If I had to hazard a guess, I'd say the node corners itself into selecting peers so fast that a small traffic spike on the VPS tower makes them all look just slow enough to be dropped. After that, either the downloader fails to realise the sync-peers are no longer there; QoS fails at hysteresis; all remaining peers are malicious; or something of the sort.
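The hysteresis guess can be made concrete: if the RTT estimate is an exponential moving average over samples from the current peer set, a latency spike shared by all peers shifts the estimate (and thus the TTL) only slowly, while the per-request timeouts keep firing at the stale value, so every peer looks "stalled" at once. A toy sketch of that lag (the weight and function names are assumptions, not go-ethereum code):

```go
package main

import "fmt"

// updateRTT sketches an exponential moving average over RTT samples:
// the estimate only moves a fraction (weight) of the way toward each
// new sample, so a sudden latency spike is absorbed slowly.
func updateRTT(estimate, sample, weight float64) float64 {
	return (1-weight)*estimate + weight*sample
}

func main() {
	est := 5.2 // seconds, roughly the estimate seen in the log above
	for i := 1; i <= 5; i++ {
		est = updateRTT(est, 20.0, 0.1) // peers suddenly answering in 20s
		fmt.Printf("after sample %d: %.2fs\n", i, est)
	}
	// The estimate (and any TTL derived from it) lags far behind the
	// real 20s latency, so all peers can be dropped as "stalling".
}
```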
Backtrace
See gist.