-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avoid storing full keys in leafs #10
Comments
I'm not sure if there is a good way to do this. The internal nodes will only store up to 10 bytes of the key, at which point they actually rely on the leaf to store the key. The problem is this, suppose you insert a few hundred keys with prefix "aaa....aaaa", and then you come along and insert "aaa{36}b.....bbbb". To determine the split point, we need the full key to determine the 37th character diverges. Since you can pick any arbitrary number for that, there will always be some limit to how much can be stored in the art nodes directly. That said, there is probably some rather complex de-duplication logic that can be used, but I don't currently have time to tackle that. |
I need insert about 1,000,000,000 node, and the size of ART will like B Tree and the height of tree is much more than B Tree. |
How many urls do you store and how much memory is used? |
We have an application where we store lots of URIs and map them to integer IDs. Since almost all URIs start with the same prefix, we decided to use the ART trie to do this mapping. Unfortunately, I realised that ART still keeps a complete copy of the key in its leaf nodes - wouldn't it be possible to dynamically reconstruct keys by traversing the trie from the root when needed (e.g. for the iteration)? In our case this would safe tremendous amounts of main memory.
The text was updated successfully, but these errors were encountered: