[WIP] feat: improve how compact block filters work#774
[WIP] feat: improve how compact block filters work#774Davidson-Souza wants to merge 2 commits intogetfloresta:masterfrom
Conversation
This commit removes the old compact block filters struct, and adds a new filter headers storage, that only keeps filter headers and a position. The position will be used by a cache, that will hold the last few filters, to allow efficient rescans near the tip. Older filters will be dropped and need to be downloaded on rescans.
This commit removes the compact block filters storage reference in floresta-electrum and the jsonrpc. For now on, rescan will be done by the `UtreexoNode`, and users should use a `NodeHandle` to request one. It also implements the basic message flow in floresta wire to download and store filter headers. We still don't checkpoint them or verify against any list of known heders.
|
Thanks @Davidson-Souza, I'll review this one over the week |
| } | ||
|
|
||
| /// A store for filter headers, allowing insertion and retrieval by block height. | ||
| pub trait FilterHeadersStore { |
There was a problem hiding this comment.
Regarding what we've been talking about inter-crate dependency, I've been thinking about how to organize the relationship between wire and filters. Currently, wire depends on filters, so it can download filters from the network and store them. filters is then used to rescan and you use NodeInterface to request stuff. The major problem with today's approach is code duplication. Unless we make filters depend on wire to access NodeInterface, we can't ask for blocks from filters.
The approach taken here moves rescan logic to wire, behind the NodeHandle. That's arguably bad, since we move something that's not related to wire into it. @csgui is proposing a new "business" crate with all our important traits. That would be an universal dependency, but other crates wouldn't need to depend on each other, since they would only interact within consumer's code. This is close to what's proposed in #717 for chain, but for all the main business traits, and using a new crate (we would leave common as for types and methods). If we do this, we can move this FilterHeaderStore trait into this new crate and remove wire's dependency on filters.
If we also trait our the node interface, we could then move all rescan code to filters and assume there's some object that can give us blocks (bonus point, our filters code would work even without wire 😮). Therefore, none of the crates would depend on each other, greatly simplifying maintainability and code quality.
I'm really leaning towards this new approach, but would like to get some thoughts about it.
Cc. @csgui @luisschwab @JoseSK999 (you guys are looking into wire and filters the most)
There was a problem hiding this comment.
I agree with this approach 👍
There was a problem hiding this comment.
So even wire would not depend on chain ideally? Any strong reason why this needs to be separated from common?
filters one sounds good to me, we then “assemble” filters and wire in floresta-node.
There was a problem hiding this comment.
So even wire would not depend on chain ideally?
Yes, the "low level" crates (that implements one specific component) would not depend on each other.
Any strong reason why this needs to be separated from common?
Not strong, but it seems like abstract interfaces and reusable definitions are distinctict enough for deserving being in their respective crates.
filters one sounds good to me, we then “assemble” filters and wire in floresta-node.
Yes, those crates only interact inside things like node and electrum.
There was a problem hiding this comment.
We could name that crate as floresta-traits
There was a problem hiding this comment.
We could name that crate as
floresta-traits
What you think about floresta-domain? There can also live structs, enums and not only traits.
There was a problem hiding this comment.
So even
wirewould not depend onchainideally?
Since they are in the same architecture level (core layer) they can talk with each other. But ideally they should be isolated.
There was a problem hiding this comment.
Any strong reason why this needs to be separated from
common?
Probably common is a good place to have only data types, error and result types shared across layers. And also some helper functions.
There was a problem hiding this comment.
What you think about floresta-domain? There can also live structs, enums and not only traits.
Yep, makes more sense than floresta-traits
There was a problem hiding this comment.
Not strong, but it seems like abstract interfaces and reusable definitions are distinctict enough for deserving being in their respective crates.
my fifty cents for this, from floresta-common:
//! # Floresta Common
//! Provides utility functions, macros and modules to be
//! used in other Floresta crates.
Its generic enough to handle common types and methods trough floresta. Also, made a repurpose for it ? Is a bit underused yk...
Also, "floresta-domain" is nice, it made me remember of DDD maybe we could bring up a structure like /chain /wire /rpc for that... "If it needs to be used on more than one floresta-crate, should live in floresta-{common/dormain}"
|
The gold standard for me would be having a method on NodeInterface that allows passing a list of spk's and an optional start height and/or hash and have it return a list of relevant Blocks. I'm not sure about stopping to keep the filters though, perhaps that should be an option to the user? Maybe "keep last X filters", or "keep filters from height X up to tip"? I say this because the increased storage tradeoff may be worth it to some use cases. For example: recurrently importing old wallets would always require downloading all filters again. |
What if this is provided by
The trait has support for caching filters. It's not implemented here, but I agree we should give the option. |
Sounds good to me |
|
From the Floresta Dev Call: Instead of:
Do:
This allows a node to spin up and be able to sync a wallet almost instantly without having to finish IBD. There's a small correctness trade-off, since we can't authenticate the filters after AssumeUtreexo until we replace them with the |
Description and Notes
This PR changes how our Compact Block Filters logic, to make it more faster and reduce code duplication.
The current approach leaves all the rescan logic to the user, and requires code duplication. This change will leave all of that to
floresta-wire. Now you send a series of script pubkeys using the node handle, and it will give you back the blocks where filters match. This has several improvements like reducing duplication, improving reliability (since thefloresta-wirecan ask for the block again in case of timeout) and perfomance, sincefloresta-wirecan batch requests in an optimal way.Another thing that we are changing we is that rather than keeping all filters, we only keep the headers and then only download the actual filters when needed. This is how most implementation for BIP157/BIP158 works, and saves significant disk space. I've introduced a new filter headers store to store it.
We should download filters for blocks before
assumeutreexo, the latest header hash will be hard-coded to make sure our peers won't lie. I'm thinking about downloading as the last part ofChainSelection. After we catch-up with filters, we build the headers locally, from blocks we validate. If we don't useassumeutreexowe build them from scratch.One thing I still don't like is how we add an op-out to this: There's a feature-gate and the filter parameter to
UtreexoNodeis optional. I was thinking, what if I move the trait intofloresta-commonand leave the parameter as optional? That would reduce the amount of generic code, and still wouldn't requirefloresta-compact-filterson those who don't want to link against that crate.