Throttling incoming indexing when Lucene merges fall behind #6066

Closed
@mikemccand

Lucene has low-level protection that blocks incoming segment-producing threads (indexing threads, NRT reopen threads, commit, etc.) when there are too many merges running.

That protection is too harsh for Elasticsearch, so it is entirely disabled. As a result, merges can fall far behind under heavy indexing, leaving too many segments in the index, which causes all sorts of problems (slow version lookups, too much RAM, etc.).

So we need to do something "softer". Simon has a good starting patch; I tested it and confirmed that (once https://issues.apache.org/jira/browse/LUCENE-5644 is fixed) it prevents too many segments in the index, at least in one use case:

Before Simon's + Lucene's fix: http://people.apache.org/~mikemccand/lucenebench/base.html

Same test with the fix: http://people.apache.org/~mikemccand/lucenebench/throttled.html

With the fix, segment counts stay essentially flat.

Here's Simon's prototype patch: s1monw@2de96f9
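
To make the "softer" idea concrete, here is a minimal sketch of one possible approach. It is not Simon's patch and does not use any Lucene or Elasticsearch API; the class `SoftIndexThrottle`, its methods, and the `maxPendingMerges` threshold are all hypothetical names chosen for illustration. The assumption is that, instead of blocking indexing threads outright, we funnel them through a single lock while merges are behind, so indexing slows down but never stops.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch of soft indexing throttling (not the actual patch).
 * While more than maxPendingMerges merges are in flight, indexing
 * operations are serialized through one lock instead of being blocked.
 */
class SoftIndexThrottle {
    private final int maxPendingMerges;                        // assumed threshold
    private final AtomicInteger pendingMerges = new AtomicInteger();
    private final ReentrantLock throttleLock = new ReentrantLock();

    SoftIndexThrottle(int maxPendingMerges) {
        this.maxPendingMerges = maxPendingMerges;
    }

    /** Called (by a hypothetical merge scheduler hook) when a merge starts. */
    void beforeMerge() {
        pendingMerges.incrementAndGet();
    }

    /** Called when a merge finishes. */
    void afterMerge() {
        pendingMerges.decrementAndGet();
    }

    /** True while merges are considered to have fallen behind. */
    boolean isThrottling() {
        return pendingMerges.get() > maxPendingMerges;
    }

    /** Runs one indexing operation, one thread at a time while throttling. */
    void index(Runnable indexingOp) {
        if (isThrottling()) {
            throttleLock.lock();
            try {
                indexingOp.run();
            } finally {
                throttleLock.unlock();
            }
        } else {
            indexingOp.run();
        }
    }
}
```

In this sketch the throttle engages and releases automatically as the merge count crosses the threshold. A real implementation would likely need hysteresis and would have to hook into the merge scheduler, which is out of scope here.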
