-
Notifications
You must be signed in to change notification settings - Fork 37.3k
Introduce 'getblockfileinfo' RPC command #27770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
92aeec3
to
50a7755
Compare
Not sure. My preference would be to figure out why the blocks are of different size and then make that deterministic instead. |
I started there but.. do we really care? (aside from dev curiosity, where I agree that wouldn't be bad to know why). The test goal is to exercise I mean, all the hardcoded {'blocks_num': 249, 'lowest_block': 0, 'highest_block': 248, 'data_size': 65319, 'undo_size': 10168} At least with the I also think that (if we choose to go into this direction, can remove the remaining hardcoded prune heights in the test). |
We definitely care enough to investigate further, before adding a new RPC command to fix an intermittent issue in a functional test. Looks like @theStack has done that investigation here: #27749 (comment), and @mzumsande has suggested a potential alternative fix: #27749 (comment):
|
Yeah, I know. theStack investigation is awesome!. We kept ping-ponging about it after this PR was pushed :). And also martin's fix sounds good. If you don't like the first sentence in my previous comment remove it. It was just the intro, no hard feelings. The background rationale there still applies even with the investigation and the possible fix. The test is still fragile, the hardcoded Furthermore, what theStack analysis really says is that So, I actually think that adding the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, what theStack analysis really says is that pruneblockchain result is not accurate. Which is the real issue for me. The RPC command result description says that it returns the "height of the last block pruned", which could not be true as blocks could have been stored out of order.
So retrieving block height 249 as "last block pruned" doesn't mean that block 248 has been pruned from disk.
But is it really incorrect that the "last block pruned" is 249? Yes, this doesn't imply that all blocks before have been pruned as well - but the RPC description isn't claiming that.
So, I actually think that adding the getblockfileinfo RPC command makes even more sense now.
Having this RPC could also make sense because of assumeutxo
, which could result in block files with a very mixed-up block order. Having a way to analyse this better seems like it could be helpful (ping @jamesob).
But I think that introducing any new RPC should primarily have some benefit to users, so that should be the main selling point - the use in tests should be more of a side benefit.
That would be the devil's advocate argument. It works for us, but not for our users.
Yeah, I agree. But also, I think there is some synergy between the functional tests and other projects building upon core's API as well. Sometimes commands that are useful for tests, not for end users, and also useful for other projects. I think that this one makes the manual pruning process a bit clearer. So there is no blind |
What about a one-line fix to clear the test issue and then a follow-up with the RPC, which may take longer to review than a trivial one-line patch? |
np for me. @mzumsande do you want to go ahead with your fix? I can just relabel this PR to "Implement getblockfileinfo RPC command" and done. |
yes, I'll open a PR with the one-line fix soon! |
I agree with @furszy and @mzumsande that a
For finding out where the block-file gaps come from, which rough block height ranges were stored in which of the different block files, having Though, I'm less convinced that typical users would have a strong need for that RPC, as the block-file structure is something internal that shouldn't be really relevant for typical use -- I doubt users running |
9fe9074 test: add block sync to getblockfrompeer.py (Martin Zumsande) Pull request description: This adds an additional `sync_blocks` call, fixing an intermittent error caused by blocks arriving out of order due to how compact block relay may revert to headers processing when the tip hasn't caught up, and resulting in slightly different pruning behavior. Making sure that all blocks from the previous tests are synced before generating more blocks makes this impossible. See #27749 (comment) and #27749 (comment) for a more detailed analysis. #27770 is a more long-term approach to avoid having to deal with magic pruneheight numbers in the first place, but that PR introduces a new RPC and needs more discussion. Fixes #27749. ACKs for top commit: MarcoFalke: lgtm ACK 9fe9074 theStack: ACK 9fe9074 Tree-SHA512: f3de1ea68725429aeef448c351ea812b805fa216912b112d7db9aceeddb1f2381b705c2577734b0d308e78ec5e0c4d26dc65fc2171f6e21f13061fc71d48216c
9fe9074 test: add block sync to getblockfrompeer.py (Martin Zumsande) Pull request description: This adds an additional `sync_blocks` call, fixing an intermittent error caused by blocks arriving out of order due to how compact block relay may revert to headers processing when the tip hasn't caught up, and resulting in slightly different pruning behavior. Making sure that all blocks from the previous tests are synced before generating more blocks makes this impossible. See bitcoin#27749 (comment) and bitcoin#27749 (comment) for a more detailed analysis. bitcoin#27770 is a more long-term approach to avoid having to deal with magic pruneheight numbers in the first place, but that PR introduces a new RPC and needs more discussion. Fixes bitcoin#27749. ACKs for top commit: MarcoFalke: lgtm ACK 9fe9074 theStack: ACK 9fe9074 Tree-SHA512: f3de1ea68725429aeef448c351ea812b805fa216912b112d7db9aceeddb1f2381b705c2577734b0d308e78ec5e0c4d26dc65fc2171f6e21f13061fc71d48216c
50a7755
to
5110139
Compare
title and description updated. |
FWIW, I was curious how hard it would be to implement The result seems to work, but is still kind of ugly and hacky. To get a parsed block's height (without flooding the node with hundreds of |
Simple command to retrieve information about a certain block file Github-Pull: bitcoin#27770 Rebased-From: f6fb9cd
Instead of hardcoding the `pruneblockchain(<height>)` heights, use 'getblockfileinfo' to obtain the highest block number of each of the block files. Making the test more robust and readable by stating which file is being pruned at every point of time (the goal is to mimic how the automatic pruning process work). Github-Pull: bitcoin#27770 Rebased-From: 5110139
Simple command to retrieve information about a certain block file
Instead of hardcoding the `pruneblockchain(<height>)` heights, use 'getblockfileinfo' to obtain the highest block number of each of the block files. Making the test more robust and readable by stating which file is being pruned at every point of time (the goal is to mimic how the automatic pruning process work).
5110139
to
5090771
Compare
Are you still working on this? |
I think this information is too low-level. In the past we've rejected changes that expose low-level information about our block storage because block files are not an external API, and exposing this kind of info limits our flexibility in regard to changing the block storage model. I agree that it can be useful to have more transparency into pruning, however, but I'm not sure adding a RPC giving information per file is the best approach. |
Conceptually, I agree. But I don't think that the block storage model flexibility argument is the best for not doing this. We do have procedures tightly coupled to the current block storage model (e.g. the best chain itself or pruning). Removing this new debug-only RPC command would be the least of our problems if we ever seek to change the storage model. That being said, no problem on closing this for now. I agree that making pruning more transparent can be orthogonal to the introduction of this command. |
Sure, that's true. In that sense it's different than say, adding block file/pos to |
Simple command to retrieve information about a certain block file Github-Pull: bitcoin#27770 Rebased-From: 24bfe4e
Instead of hardcoding the `pruneblockchain(<height>)` heights, use 'getblockfileinfo' to obtain the highest block number of each of the block files. Making the test more robust and readable by stating which file is being pruned at every point of time (the goal is to mimic how the automatic pruning process work). Github-Pull: bitcoin#27770 Rebased-From: 5090771
Implemented
getblockfileinfo
RPC command to obtain block files related data.The new RPC could be useful for:
Make manual pruning process more transparent: Currently, users blindly provide a height to
pruneblockchain
and only realizes up to which block the node was pruned from the RPC command result. This new introduced command lets the user know where to set thepruneblockchain
height by knowing the highest and lowest block heights stored on each file. So users will actually be able to remove specific block ranges (with some exceptions, as blocks can be stored out of order in different block files.. but that is a different topic).Make tests more robust and readable by removing magic numbers like the ones presented in
rpc_getblockfrompeer.py
(which are actually off by a good degree and not really document anywhere, see comment).And also for testing AssumeUTXO. Which, per mzumzande comment: it usage could result in block files with a very mixed-up block order. So having a way to analyse this better seems like it could be helpful. Also see a live scenario in theStack comment.