-
Couldn't load subscription status.
- Fork 1.2k
Add readBytes method to RandomAccessInput #12600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi,
Write a test in TestMultiMmap. The other test is only about general setup. The "cool tests" using multiple segments or multiple buffers is there. |
|
I need more time to review this as I am a bit crowded. Sorry for possible delays. If you get tests running with all kinds of mmap, just give me some feedback. |
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java
Outdated
Show resolved
Hide resolved
|
I changed to draft until all is tested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me the code looks fine. To be safe, I will change the Policeman Jenkins Job temporarily to your branch.
I left some minor comments about method placement and a future simplification. In general, once we add Java 22 support (final version of Java Panama FFI API), I try to cleanup the older version.
At least in main (and possibly 9.x) we should remove Java 19 and Java 20 support (out of maintenance) and only keep Java 21+ for Panama Vector and Panama MMap.
lucene/core/src/java19/org/apache/lucene/store/MemorySegmentIndexInput.java
Outdated
Show resolved
Hide resolved
|
I changed the https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/ (Linux MMAP Job) to use your branch. Please wait a bit until the checker is happy, as it tries all different java versions. This gives us confidence. Except the MultiMMap test do we have other productive ode already using the new API? After merging just ping me to revert the branch settings in the policeman job - of course do not delete the branch! The first run using this branch is: 1287 -- https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/1287/ Uwe |
|
In general to me it is still questionable if we really need a bulk random access byte[] support in RandomaccessInput as this is not required by "core Lucene". I am partly agree with this, but if somebody asks for float[] or long[] bulk reads with positional access I will -1 that. So I leave it up to others to decide if this makes sense not only for DocValues but possibly also other code in Lucene. |
|
I am just trying to have an abstraction that can replace the BytesRef output for binary doc values with something that does not impose the internal representation of the bytes like BytesRef does. I hope this will avoid some extra byte copying when reading and when writing and avoid the risk of a humongous allocation when reading binary doc values. I am ok if we said that RandomaccessInput is not the right abstraction and we need something else. On the other hand it seems to fit well. |
|
@uschindler I merged the change. I tried to backported but it is not possible ByteBuffer#get(int, byte[], int, int) is not available in the java version on line 9.x. I think it is ok to leave this change until 10. |
|
Hi @iverase,
I changed the Policeman Jenkins MMAP job back to Lucene Main branch. The next run would have failed as you deleted the branch! |
|
P.S.: See docs here. The method came with Java 13. |
|
@iverase, I think you have to move the changes entry to Lucene 10. |
I did it already in ba74da1
Oops, I delete them mechanically, glad you changed quickly.
I think that is the best approach, We can do that if there is need to have it in the 9.x branch. |
Adds a new method to RandomAccessInput tio bulk read bytes into a provided byte array.
Adds a new method to RandomAccessInput tio bulk read bytes into a provided byte array. The default implementation reads byte by byte but faster implementations are provided for all lucene implementations.
@uschindler I tried to provide implementations for the
MemorySegmentIndexInputProvider. I am a bit confused because I runTestMmapDirectoryseveral times hoping it will cover single and multiple segments.I never saw it running woth multiplr segments, is there another test or something I need to do? Anyway, it would be great if you can review it.closes #12599