Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

avoid key collision on child trie and proof on child trie #2209

Closed
wants to merge 142 commits into from

Conversation

cheme
Copy link
Contributor

@cheme cheme commented Apr 4, 2019

  • Initial issue

initial issue is that child trie key/value stored in rocksdb are all build the same way for all child trie,
therefore two identical child trie will get all their key duplicated and the pruning will break one of the two tries (only child trie with guaranty of not having same key/value content can currently be use with pruning).

  • Keyspace change

This PR solves the initial issue by prepending a unique id to the keyvalue db keys. It uses a HashDB implementation KeySpaceDB to prepend a unique trieid.

! with current implementation it is only prepended if PrefixedMemoryDB is use.
Long term design should move that keyspace information to the db layer (have keyvalue db that manage two level of collections: the heavy one and the light one).

Those prefix should also allow more efficient operation on child trie.

  • accessing all child trie keyvalue db key from rocksdb, allowing to skip what would otherwhise be a query of every nodes of the child trie for every of its states (blocks).

  • related to this access we can have efficient deletion.

  • related to this access we can export/import trie between chain (the prefix will need to be rewritten as it is only unique for a chain context).

  • Api change

This PR also contains a change of child trie api: do not do operation depending on storage_key but do operation depending on child_trie state.

This can allow efficient child trie usage (no need to query parent trie on every operation).

Recently I tried to split this PR in two (keyspace first, api second), but this api change is needed when a child trie does not exists (having the child trie in parameter makes things easier).

@@ -62,7 +63,11 @@ pub trait AccountDb<T: Trait> {
pub struct DirectAccountDb;
impl<T: Trait> AccountDb<T> for DirectAccountDb {
fn get_storage(&self, _account: &T::AccountId, trie_id: Option<&TrieId>, location: &StorageKey) -> Option<Vec<u8>> {
trie_id.and_then(|id| child::get_raw(id, location))
// TODO pass optional SubTrie or change def to use subtrie (put the subtrie in cache (rc one of
Copy link
Contributor Author

@cheme cheme Apr 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did use account_db trie_id (see other comments) as keyspace (and as subtrie parent location).
So there is a need for caching SubTrie struct. TODO issue for that ?
There is a possible optimization by putting SubTrie directly at Account_id location, this would only be possible by:

  • storing account infos at the same location as subtrie (in subtrie prefix)
  • changing this pr to allow storing subtrie at any location
    For both point it requires implementing a way to store subtrie with different encoder (and with additional info).

@thiolliere maybe this pr (not sure it will get merged there might be better way of fixing the collision), could be of interest regarding #1882 or #1883 .

Copy link
Contributor

@gui1117 gui1117 Apr 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is a need for caching SubTrie struct. TODO issue for that ?

hmm I can also think of another anwser. The trie_id: Option<&TrieId> is already an optimisation that says: if there is a trie_id in direct_storage then you have to give me that I won't do the lookup from AccountId, so you can cache it for me.
This thoses changes you introduce we can just say: if there is a subtrie associated to the account already then give it I won't do the lookup.

Having subtrie in AccoundInfo seems cool though but not mandatory here.

@thiolliere maybe this pr (not sure it will get merged there might be better way of fixing the collision), could be of interest regarding 1882 or 1883 .

I don't know in which context this collision can happen I design things without taking this into consideration though. But interesting thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I see your optim, it is better design (that is the reason why I pinged you :)

I don't in which context this collision can happen I design things without taking this into consideration though. But interesting thanks

without this pr it happens whenever two subtries are similar and content got pruned (for contract it should happen a lot). It only requires that both subtries got same branch path (also content (same proof would be more correct)) to the deleted value.

Having subtrie in AccoundInfo seems cool though but not mandatory here.

As long as we use a trie_id as subtrie key from account_id (merging account info and trie info) there is no need to take this into consideration.

use heapsize::HeapSizeOf;
use primitives::subtrie::{KeySpace, SubTrie};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backend struct renamed as MapTransaction and VecTransaction are really suitable with SubTrie struct: especially the usage of Option<SubTrie> seems awkward at some points).

@cheme cheme added the A3-in_progress Pull request is in progress. No review needed at this stage. label Apr 4, 2019
@cheme cheme removed the A3-in_progress Pull request is in progress. No review needed at this stage. label Apr 5, 2019
cheme added 6 commits August 8, 2019 22:56
- use keyspace instead of storage key (situation before was an issue
and gave the wrong impression). Subscription rpc for child does not
exists but shall use storage_key (we may keep keyspace internally).
- add children change to change set (probably break subscription rpc
format).
@cheme cheme added A4-gotissues and removed A0-please_review Pull request needs code review. labels Sep 6, 2019
@cheme
Copy link
Contributor Author

cheme commented Sep 6, 2019

I did merge the pr with master.
There is an unsolve issue: actions on child trie (set_child_trie) do not manage their extrinsics.
It simply requires another extrinsics counter.

Also I did not reassert the logic of using keyspace either: there might be some place where storage key should be kept.

Seeing how stale this PR is, I will refer to my previous comment #2209 (comment) (see missing point) and conclude this is a wrong approach.

For instance having the technical keyspace in the state will make some rather complicated specification, when an implementation using reference counted keyvalue do not need it at all.

Therefore I am switching this PR to 'A4-Got issue' (could also be close (will require an issue creation first)) in favor of another approach (a bit more involved):

  • create an offstate storage, something similar to aux_storage but that is aligned with the block chain state (meaning that it needs to be change from overlay layer and needs pruning to: very similar to trie state).
    This offstate storage may have also some similarity with offchain local storage but it is unclear at this point.
  • make a pr
  • use similar keyspace as in this pr for child tries, but do not touch child api (keep using storage key only and
    manage some caching of storage_key -> keyspace in the new offstate storage).
  • make a pr
  • maybe try to get into some api change to avoid all those redundant query of child trie state by using a different api (notably splitting the child trie proof in two part as it is the case in this pr). At this point things could be on par with this pr (except there is no keyspace in child trie structure). + child trie structure could spawn from current description (not serialized and type of child trie contain in path).
  • make a pr

@gavofyork
Copy link
Member

@cheme please make an issue for it and close this when done.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants