-
Notifications
You must be signed in to change notification settings - Fork 15k
[CAS] Add OnDiskTrieRawHashMap #114100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CAS] Add OnDiskTrieRawHashMap #114100
Conversation
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Add OnDiskTrieRawHashMap. This is a on-disk persistent hash map that uses a Trie data structure that is similar to TrieRawHashMap. OnDiskTrieRawHashMap is thread safe and process safe. It is mostly lock free, except it internally coordinates cross process creation and closing using file lock. OnDiskTrieRawHashMap is used as the foundation to implement OnDisk CAS storage which maps hash to store data. Reviewers: Pull Request: #114100
3d10915 to
01195e5
Compare
Created using spr 1.3.6
| : Hash(Hash), Data(Data.begin(), Data.size()) {} | ||
|
|
||
| ArrayRef<uint8_t> Hash; | ||
| ArrayRef<char> Data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity — why is this using a signed char as storage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What suggestions you have? It is quite natural as Data can be get as StringRef which is const char*, but I guess you are asking why Hash and Data are different types? I don't really have preference either way.
| printHexDigits(OS, Bytes, LastBinaryBit, HashEndBit - LastBinaryBit); | ||
| } | ||
|
|
||
| static void appendIndexBits(std::string &Prefix, size_t Index, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use a raw_string_stream here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly just bit manipulation with adding single char to string. I don't see using a stream is making it easier to read.
Created using spr 1.3.6
|
Ping |
Created using spr 1.3.7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this mostly looks fine. How much coverage are you getting out of the tests? It seems like it would be relatively low, given the implementation size. i'm not suggesting you need 100% coverage or anything like that, it just seems like there are very few tests for the amount of APIs in the header.
Also unclear on the OnDiskDataAllocator, and how it relates to the rest of this patch.
Let me make sure it covers most of the APIs. In downstream code, this is treated as an implementation details for the CAS, so I feel it is a bit wasteful to do extensive testing (like thread-safe/multi-process tests) on all layers. But I agree it should cover most of the APIs.
Going to split that up into a separate PR.
Are you suggesting just not building that when the feature is not enabled and trying to use the feature will result in link failure, instead of some error during runtime? I am ok to switch to that too. |
Created using spr 1.3.7
Created using spr 1.3.7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I just have the 1 Q on the format that I left in line, w.r.t the header layout.
Created using spr 1.3.7
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/154/builds/22097 Here is the relevant piece of the build log for the reference |
Fix: #161268 |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/21573 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/145/builds/9948 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/23593 Here is the relevant piece of the build log for the reference |
|
Fix for |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/39/builds/8066 Here is the relevant piece of the build log for the reference |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/76/builds/13031 Here is the relevant piece of the build log for the reference |
Add OnDiskTrieRawHashMap. This is a on-disk persistent hash map that uses a Trie data structure that is similar to TrieRawHashMap. OnDiskTrieRawHashMap is thread safe and process safe. It is mostly lock free, except it internally coordinates cross process creation and closing using file lock.
…lvm#161283) Link LLVM_PTHREAD_LIB from LLVMCAS library to fix rhel bots.
Add OnDiskTrieRawHashMap. This is a on-disk persistent hash map that
uses a Trie data structure that is similar to TrieRawHashMap.
OnDiskTrieRawHashMap is thread safe and process safe. It is mostly lock
free, except it internally coordinates cross process creation and
closing using file lock.