Parameter for a threshold value of peers removed that are delivering stale transaction or generally misbehave #28345
Comments
I have created a bash script that does something similar to what was requested.
I think an issue with disconnecting from a peer that is delivering stale transactions is that the transactions may not be stale yet from the peer's point of view if it is behind the chain for any reason, and disconnecting could exacerbate the problem. Sleeping a bit before engaging the peer further seems appropriate.
I have similar issues. I see ~15K/hour "Peer delivering stale transactions" messages, supposedly warnings (but there is nothing I can fix), and ~25K/hour "Announced transaction size mismatch" messages. I realize this is triggered by external factors, but it seems excessive for something I cannot control.
@SjonHortensius thanks for the report. The log level is a slightly different issue than modifying the actual handling of the peer. I think downgrading the error could make sense.
@SjonHortensius @lightclient The tx announce issue has been "fixed" in the latest release.
Hi @SjonHortensius, can you tell me in which commit I can find the fix? I'm just curious :-) Based on the discussion, I understand that this behaviour is not a problem in itself? I had missed attestations on my validator, and the error message indicated that the EL was not reachable via HTTP during that time. When I then looked at what Geth was doing during that time, I noticed these error messages. Can these events, in high numbers, not affect the stability of Geth? Looking at the code base, I have seen that they arise from the otherreject counter. Isn't it possible that errors are accumulating here that should perhaps be handled separately? At the moment, all remaining errors that are neither txpool.ErrAlreadyKnown nor txpool.ErrUnderpriced end up in otherreject.
I don't have exact knowledge of the possible error codes and I'm not a developer, but as a user I see that Geth has phases where it can no longer be reached, and during those phases I see exactly these error messages.
@drozdse1 you'll find the fix here: #28356 - as you can see in the commit message, these warnings are caused by Erigon sending bad tx announces. Geth complains when receiving those, but it cannot drop the peer, as that would cause too many invalid peers and hurt the network. By lowering the verbosity, the Erigon team has time to fix the issue without Geth annoying their users too much.
The stale tx log was also lowered, should be in the next release.
Rationale
After updating my client, I often see the following entries in my Geth log file:
For now, I am removing the offending peer manually through the console:
admin.removePeer()
Implementation
It would be very helpful to have a parameter that specifies the maximum number of misbehaviours per peer; if this number is exceeded, the corresponding peer would be automatically removed or blocked for a specific period.
The misbehaviour could, of course, be related to other typical problems that lead to other warnings in the log file or are generally a danger to the stability of the Geth application.