[BUG] Segment-based replication / Remote Store is not compatible with Lucene 9.12 / 10 #15902
Comments
@reta Thanks for raising this. From the trace, it looks like this applies to any node-to-node file copy, which would include peer recovery.
It might be, the presence of
specifically points to segment replication, but to your point, it is not limited to it. Thanks @mch2!
Yep, that's right: the transfer handler is shared between both functions. I will take a look at this.
OK, this is because we are using
I thought we might get by with clones of the original inputs, because the clones themselves aren't closed. However, creating a clone checks that the original is confined (via buildSlice here), and this prevents us from creating clones from separate threads.
I also suspected this, but changing to READ was not helping, at least with
Yeah, the wrapper is still enforcing the use of READONCE only on segments_N files. I'm not sure of the reasoning there, but I figure most will transfer within a single chunk, so I've made a fix, #15922, on top of your changes that conditionally uses a READONCE context for those files and re-opens the input per chunk; see 9df169a.
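To illustrate the "re-open the input per chunk" idea, here is a minimal sketch using only `java.nio`, not the OpenSearch or Lucene code from #15922. The class `PerChunkReader` and its methods are hypothetical names. The assumption is that every chunk is read through a handle that is opened, used, and closed on the same thread, so no handle ever crosses a thread boundary.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: reads a file in chunks, re-opening it for every chunk. */
final class PerChunkReader {

    /** Opens the file, reads one chunk at the given offset, and closes the handle on the calling thread. */
    static byte[] readChunk(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // end of file reached before the chunk was full
                }
                position += n;
            }
            buffer.flip();
            byte[] out = new byte[buffer.remaining()];
            buffer.get(out);
            return out;
        }
    }

    /** Reads the whole file; each chunk is read on a pool thread through its own short-lived handle. */
    static byte[] readAll(Path file, int chunkSize, int threads) throws Exception {
        long size = Files.size(file);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> parts = new ArrayList<>();
            for (long offset = 0; offset < size; offset += chunkSize) {
                final long start = offset;
                final int length = (int) Math.min(chunkSize, size - start);
                parts.add(pool.submit(() -> readChunk(file, start, length)));
            }
            // Reassemble the chunks in submission order.
            byte[] result = new byte[(int) size];
            int written = 0;
            for (Future<byte[]> part : parts) {
                byte[] chunk = part.get();
                System.arraycopy(chunk, 0, result, written, chunk.length);
                written += chunk.length;
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off in this sketch is an extra open/close per chunk, which should matter little if, as noted above, most of these files transfer within a single chunk.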
Checking on that draft, there are quite a few tests still breaking because we are reading segments* files with a non-READONCE context. I think in some of these cases we may be OK flipping those to READONCE, i.e. recovery/upload, where the files would likely be read once and would not benefit from being cached by the OS. What I've gathered so far is that these are: IndexShard: during remote store restore flows
Thanks a lot @mch2
Getting there, please see #15333
Describe the bug
I have been working on a routine Apache Lucene 9.12 update (snapshot) but ran into a showstopper:
The issue comes from the fact that Apache Lucene uses the Foreign Function & Memory (FFM) APIs and, more specifically,
MemorySegmentIndexInput
which is backed by a confined memory segment that is not supposed to be shared between multiple threads. At this moment, I do not know the exact solution to this; more eyes / minds would certainly be beneficial.
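The confinement rule can be sketched without Lucene or the FFM API itself. In the sketch below, `ConfinedInput` and `ConfinementDemo` are hypothetical stand-ins, not Lucene or JDK classes: the input remembers the thread that created it and rejects reads from any other thread. The JDK's confined arenas enforce a similar rule by raising `WrongThreadException`; the stand-in throws `IllegalStateException` so that it runs on any recent JDK.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a thread-confined input; not Lucene or JDK code. */
final class ConfinedInput {
    private final Thread owner = Thread.currentThread(); // the creating thread owns the input
    private final byte[] data;

    ConfinedInput(byte[] data) {
        this.data = data.clone();
    }

    /** Reads one byte, but only when called from the owning thread. */
    byte readByte(int pos) {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("accessed from a thread other than the owner");
        }
        return data[pos];
    }
}

final class ConfinementDemo {
    /** Returns the exception raised when a second thread reads the input, or null if none was raised. */
    static Throwable readFromOtherThread(ConfinedInput input) throws InterruptedException {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread other = new Thread(() -> {
            try {
                input.readByte(0);
            } catch (Throwable e) {
                failure.set(e);
            }
        });
        other.start();
        other.join();
        return failure.get();
    }
}
```

A read on the creating thread succeeds, while `ConfinementDemo.readFromOtherThread(input)` returns the `IllegalStateException`. This mirrors the situation described here, where an input opened on one thread is handed to a transfer handler running on another.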
Related component
Storage:Remote
To Reproduce
Please see #15333
Expected behavior
The tests should be passing.
Additional Details
Plugins
Standard
Additional context
Please see #15333