-
Notifications
You must be signed in to change notification settings - Fork 693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/2725 #2746
Conversation
…is queried while it is in the process of being invalidated
…on error from the GetBlocksInv handler, reply a NACK instead of erroring out
…ase when choosing where to start scanning for new block downloads. Also, use the PeerNetwork::with_http() interface to access the inner HTTP peer when downloading blocks and microblocks.
…t before the reward cycle it reported that the local peer diverged. Also, record this sortition height for the block downloader's consumption.
…dicates that a network error can be resolved by simply re-trying the poll loop.
…nce can be passed to the HTTP handler code. Also, do a better job at propagating hints from the inventory state-machine to the downloader as part of #2730.
…ress #2738 by way of using cached data in the PeerNetwork to construct a /v2/info response without doing any I/O
…ge-podge of fields from it, so it can be passed into the HTTP request handler
…ock of the reward cycle in which the canonical Stacks chain state starts
…s, so /v2/info acts on the *current* chain view
@jcnelson Are there any tests that we should add as part of the release process in order to test this change? |
Yes -- per the blockchain engineering meeting, it's imperative that we spin up a release candidate from genesis on both mainnet and testnet and verify that it reaches the chain tip. |
@wileyj Want to take this branch for a spin? |
@jcnelson ready to second spin-up test with the commits above? |
Not yet -- still getting stuck (investigating) |
…isn't in the middle of processing blocks
…r that's diverged from us, start at either the reward cycle that contains the highest processed Stacks block, or INV_REWARD_CYCLES fewer reward cycles than the diverged reward cycle -- whichever is lower
… of either the highest processed Stacks block, or the inventory sortition height hint from the inv state machine -- whichever is lower
Okay, this time it managed to reach the testnet chain tip. Going to spin up from testnet genesis again to verify that it can do so without getting stuck even once. |
👍 spinning up our node from genesis now |
Nice work, @jcnelson . Thanks :). |
Okay, my node reached the chain tip on testnet and is in steady state. |
Both my testnet and mainnet nodes sync'ed with this patch, without issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but the allowed warning should be re-enabled.
As in 2a1e8f5? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Jude
I was talking a look at this PR to review.
Even though I lack a lot of context, I was planning to just review based on the testing.
I was surprised to see that no tests were affected, or added.
Is it possible to add any unit tests that would exercise some of the new behavior? Perhaps this could be in a future PR.
Otherwise, can we document in the PR description how this was tested and why we think that is sufficient?
In terms of lines changed, the bulk of the delta in this PR comes from refactoring. Namely, we now wrap access to the HTTP network state with the new The truly new functionality -- which is simply concerned with making sure the block downloader will start downloading blocks at the right reward cycle -- is really only testable against a live system. You'd need a live Bitcoin node and a Stacks node that not only has all the blocks, but also will occasionally disconnect from bootstrapping nodes and render them in a state where they have no choice but to process reward cycles for which they don't have the anchor block. You'd also need to instrument the Stacks node so that it can be stopped and restarted while it is bootstrapping (clearing its inventory state), in order to verify that the node does not spend an unreasonable amount of time bootstrapping. Implementing such an integration test that faithfully simulates this condition would not only take more work than this PR, but also would be redundant -- this environment is exactly what the live testnet is supposed to provide. The release procedure explicitly calls for booting up a new node from genesis on both testnet and mainnet in order to give PRs like this their due testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcnelson If you think this has had the full test-suite run on it then LGTM.
The tests are failing is because bitcoin.org is down right now. However, it previously passed with the current patches with the exception of only |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
This PR fixes a lot of issues underpinning #2725. Namely, it fixes #2701, #2728, #2730, and #2738 as well.
The gist of the changes is that we make it so that when the block inventory state-machine determines that the node is in the initial block download, and one of its bootstrap nodes NACK's its request for a
PoxInv
message or aBlocksInv
message at a given reward cycle, it will only refresh its view of the lastINV_REWARD_CYCLES
reward cycles from its bootstrap peers instead of all prior reward cycles. This situation arises more often than not -- for example, whenever the anchor block is "late" relative to the PoX sync watchdog -- which leads to a significant slow-down in synchronization as the node gets closer to the testnet chain tip (which has thousands of reward cycles). In addition, the changes here make it so that if this happens, the download state machine will begin scanning the local chain state from the sortition height at the start of the highest reward cycle synchronized with the bootstrap peer (instead of the very first sortition). This significantly speeds up block downloads, because like the inventory state machine, the download state machine only processes one reward cycle at a time (and making it start right where it's bound to find new blocks makes synchronization go faster towards the chain tip).The PR also fixes a few odds and ends, like making it so
/v2/info
doesn't require any I/O to complete and making it so the inventory state machine doesn't have to load the PoX bitvector on each pass (which is surprisingly slow).I'm in the process of testing this on a local testnet follower; will report back when it's ready.