-
Notifications
You must be signed in to change notification settings - Fork 280
refactor(core-p2p): Remove broken getRandomDownloadBlocksPeer() #2121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
getRandomDownloadBlocksPeer() would end up in an "infinite" recursion loop if the chain of the node is >10 blocks forked from the chain of all other nodes. This is because __getRecentBlockIds() returns just the highest 10 blocks of the node's chain. getRandomDownloadBlocksPeer() is not necessary with peer verification because during peer verification we check for common blocks in a more robust way (not just the highest 10), going back to the genesis block if necessary. So, for sure, if a peer is verified, we know that he has common blocks with us. If a peer could not be verified, then he would not be in the list of our peers. Thus, just use getRandomPeer() instead of getRandomDownloadBlocksPeer(). Also, improve some diagnostic messages to contain a reason why a failure occurred.
|
@air1one @faustbrian @supaiku0 - please review this in the next few days. Be sure to explicitly select labels so I know what's going on. If no reviewer appears after a week, a reminder will be sent out. |
|
Notice: this patch will re-enable peer verification |
Codecov Report
@@ Coverage Diff @@
## develop #2121 +/- ##
===========================================
- Coverage 79.35% 79.28% -0.07%
===========================================
Files 329 329
Lines 7889 7898 +9
Branches 1127 1105 -22
===========================================
+ Hits 6260 6262 +2
- Misses 1597 1604 +7
Partials 32 32
Continue to review full report at Codecov.
|
|
Looks good to me. |
spkjp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs some changes, I ran into a new issue on my node.
| const peerVerifier = new PeerVerifier(this); | ||
|
|
||
| if (deadline <= new Date().getTime()) { | ||
| throw new PeerPingTimeoutError(delay); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When my node is trying to fetch a peer list, this exception is thrown and then it fails to get any new peers.
2019-02-20 03:06:31][INFO]: Fetching a fresh peer list from http://167.114.29.54:4002
[2019-02-20 03:06:33][DEBUG]: Request to http://167.114.29.54:4002/peer/blocks/common?ids=7650603120054377288,15579271131030939594,6475133262905089782,15240575364085420699,3955350625842386444,8592197812490173822,14213463318813299626,17826610844628864089,1106400051818429549 failed: timeout of 138ms exceeded
Happens everytime. If increase the ping timeout in getPeers it manages to get a peer list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message you posted does not originate from throw new PeerPingTimeoutError(delay), it comes from somewhere deeper - where we do try to access /peer/blocks/.... However, more importantly, I noticed that this ping() call from getPeers() is not necessary because the latter is only interested in /peer/list. As removing it is a separate, somewhat unrelated change, I created another PR for it: #2123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[2019-02-20 03:06:33][DEBUG]: Request to http://167.114.29.54:4002/peer/blocks/common?ids=7650603120054377288,15579271131030939594,6475133262905089782,15240575364085420699,3955350625842386444,8592197812490173822,14213463318813299626,17826610844628864089,1106400051818429549 failed: timeout of 138ms exceeded
[2019-02-20 03:06:33][INFO]: Peer verify 167.114.29.54: failure: could not get blocks starting from height 1566489 from peer: exception: timeout elapsed before successful completion of the verification
getPeers -> peer.ping() -> peerVerifier.checkState(body, deadline)) -> boom
The nsect calls probe which does hasCommonBlocks(Object.keys(probesHeightById), msRemaining) and msRemaining is reduced in each probe iteration... until it is really small (like 138ms) and then it is bound to fail.
The other PR removes ping from getPeers so this won't pose a problem in this scenario anymore. Though the behavior is still odd. 🤔
|
@vasild Your pull request needs some changes. Please wait for a comment from one of our developers for more information. |
getRandomDownloadBlocksPeer() would end up in an "infinite" recursion
loop if the chain of the node is >10 blocks forked from the chain of all
other nodes. This is because __getRecentBlockIds() returns just the
highest 10 blocks of the node's chain.
getRandomDownloadBlocksPeer() is not necessary with peer verification
because during peer verification we check for common blocks in a more
robust way (not just the highest 10), going back to the genesis block if
necessary. So, for sure, if a peer is verified, we know that he has
common blocks with us. If a peer could not be verified, then he would
not be in the list of our peers. Thus, just use getRandomPeer() instead
of getRandomDownloadBlocksPeer().
Also, improve some diagnostic messages to contain a reason why a
failure occurred.
Proposed changes
Types of changes
Checklist