Skip to content

IndexWriter.merge does not properly clean up merge reader instances #14930

Open
@thecoop

Description

@thecoop

Description

Following on from #14844 (comment), there are cases where IndexWriter.merge does not close merge instances before closing the 'root' readers, leading to the necessity of the extra check in AssertingKnnVectorsReader.close.

In particular, SegmentMerger (which creates and holds the merge reader instances) is not cleaned up if an exception is thrown later, which is where the original problem was found. There may of course be more instances elsewhere.

Currently, there's some ambiguity in the role of merge reader instances - are they clones? Do they have their own independent lifecycle? Are they dependent on their 'parent' reader at all? They were just clones, but the introduction of finishMerge introduces a lifecycle that is not quite independent - add on top of the restrictions around how ReadAdvice needs to be managed.

I think the way to solve this is to make merge instances properly closeable by replacing finishMerge with a full close implementation, with their own independent lifecycle. SegmentMerger would then be closeable, as part of tidying up after an exception during merging. Most readers wouldn't actually need the independence given to them, but it's better to be more permissive in what merge readers are, which can be ignored by most implementations, than less permissive, leading to lifecycle issues when a merge reader can't do what it needs to by its implementation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions