-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix when key out of Boundary for MemoryAlignedDataMap #880
Conversation
Quality Gate passedIssues Measures |
@yagagagaga Thanks a lot, this is a good contribution. This part of the codebase really deserves more checks and tests. Regarding the HashMap, this is what I used initially. However, when dealing with very large datasets (several billion records), the cost of computing the hash code when accessing the HashMap is noticeable compared to the direct memory access of an array. The size of the array can easily be mitigated by increasing the size of segments. Unfortunately, I had to remove the JMH benchmarks due to license compatibility requirements (JMH is released under the GPL). I would keep the array for now and reintroduce some sort of benchmarks if we need to use another data structure To format the code, you need to execute |
Thanks for your reply, I had reformat the code and rollback the Memory.java. |
Thank you, could you elaborate a bit more on the computation of the upperBoundary? I'm probably missing something, but I would naivly implement it as follow, which does not give exacly the same result as your implementation. this.upperBoundary = ((long) memory.segmentSize()) * ((long) Integer.MAX_VALUE) / (long) dataType.size(); |
As all we known, Long is a 64-bits length data type with 63-bits represents the numerical value and 1-bit represents positive or negative.
If following your implement, some wrong will happened: // this.upperBoundary = ((long) Integer.MAX_VALUE) / (long) dataType.size() * ((long) memory.segmentSize());
var map = new MemoryAlignedDataMap<>(new IntegerDataType(), new OnHeapMemory(1024));
map.put(549755813887L, 1); // error will happen, but as the matter of fact, you can calculate segmentIndex and segmentOffset with this number
var map = new MemoryAlignedDataMap<>(new IntegerDataType(), new OnHeapMemory(1L << 34)); // if could be, the upperBoundary will numeric overflow |
Thank you for the detailed explanation and the fix. As said, this is typically the kind of contribution we need to improve the robustness of the project. Do not hesitate to tell us more about your use case and reach out if you have questions, it would be a pleasure to collaborate! |
Hi @bchapuis We can definitely revert this change (re-add the benchmarking code module). I believe we were just using its API without copying its source code into our project, and we won't be distributing the JMH-related jars in the binary version.
|
I looked into it, and sure enough, there's some discussion on this. https://issues.apache.org/jira/browse/LEGAL-399 :smi] |
@CalvinKirs Thanks a lot for this pointer. It's good to know that JMH can be used (I acted as a lumberjack in the previous release to ensure that we are compliant with the license ;) |
before: