Store IO throttling throttles far more than asked #6018

Closed
@mikemccand

I've been digging into the "merges can fall behind" problem at high
indexing rates, and I discovered some serious issues with the IO
throttling, whose default we recently (#5902) raised from 20 MB/sec to
50 MB/sec.

Net/net I think when we ask for 50 MB/sec today we are really
throttling at something like 8 MB/sec!

Details:

I indexed a bunch of small log-file type docs into 1 shard, 0
replicas, using 1 sync _bulk client, to the point where it did its
first big-ish merge (611 MB, 440K docs); the merge does not use CFS so
it's really writing 611 MB. I'm using a fast SSD.

With no throttling (index.store.throttle.type=none), the merge takes
20.8 seconds.

With the default 50 MB/sec merge throttling, it takes 72.1 sec, which
is far too long (611 MB / 50 MB/sec = 12.2 sec). The rate limiter
enforces the instantaneous rate, so at worst the merge time should have
been 20.8 + 12.2 = 33 sec, but likely much less than that because
merging takes CPU time.
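
As a sanity check on the arithmetic above, here is a small Java sketch. The class and method names are made up for illustration and are not from Lucene or ES. It computes the time the throttle alone requires, the worst-case bound, and the effective write rate implied by the observed 72.1 sec. That last figure is presumably where the "something like 8 MB/sec" estimate comes from.

```java
import java.util.Locale;

// Hypothetical helper for the back-of-the-envelope numbers in this issue.
final class MergeTimeEstimate {
    /** Seconds needed just to pass `megabytes` through a limiter at `mbPerSec`. */
    static double throttleFloorSec(double megabytes, double mbPerSec) {
        return megabytes / mbPerSec;
    }

    /** Worst case: merge work and throttle pauses never overlap at all. */
    static double worstCaseSec(double unthrottledSec, double megabytes, double mbPerSec) {
        return unthrottledSec + throttleFloorSec(megabytes, mbPerSec);
    }

    /** Effective write rate implied by an observed merge time. */
    static double effectiveMbPerSec(double megabytes, double observedSec) {
        return megabytes / observedSec;
    }

    public static void main(String[] args) {
        double mergeMb = 611.0;
        // prints "floor: 12.2 sec"
        System.out.println(String.format(Locale.ROOT, "floor: %.1f sec",
                throttleFloorSec(mergeMb, 50.0)));
        // prints "worst case: 33.0 sec"
        System.out.println(String.format(Locale.ROOT, "worst case: %.1f sec",
                worstCaseSec(20.8, mergeMb, 50.0)));
        // prints "effective: 8.5 MB/sec"
        System.out.println(String.format(Locale.ROOT, "effective: %.1f MB/sec",
                effectiveMbPerSec(mergeMb, 72.1)));
    }
}
```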

So I dug in and discovered one problem, which I think is caused by the
super.flush followed by delegate.flush in BufferedChecksumIndexOutput:
the RateLimiter is always called alternately, first on 8192 bytes and
then on 0 bytes. If I fix RateLimiter to just ignore those 0 bytes,
the merge time with 50 MB/sec throttle drops to 49.9 sec: better, but
still too long. (I think once we cut over to Lucene's checksums this 0
byte issue will be fixed?)
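
The "ignore those 0 bytes" change could look roughly like the guard below. This is only a sketch: `ZeroByteGuard` and `skippingEmptyWrites` are hypothetical names, and the limiter is modeled as a plain function from bytes written to nanoseconds paused rather than as Lucene's actual RateLimiter class.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical wrapper: zero-byte calls never reach the real pause logic.
final class ZeroByteGuard {
    /**
     * @param pause maps "bytes just written" to "nanoseconds paused"
     * @return the same function, except that empty writes return 0 immediately
     */
    static LongUnaryOperator skippingEmptyWrites(LongUnaryOperator pause) {
        return bytes -> bytes <= 0L ? 0L : pause.applyAsLong(bytes);
    }
}
```

With the alternating 8192 / 0 call pattern described above, this halves the number of times the pause logic runs.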

System.nanoTime is actually quite costly, so I suspect the overhead of
just checking whether to pause, and of calling Thread.sleep, is way
too much when the pause time is small. So I changed SimpleRateLimiter to
just accumulate the incoming bytes and then once it crosses 1 msec
worth at the specified rate, invoke the pause logic.
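
A minimal sketch of that accumulate-then-pause idea follows. It is not the actual SimpleRateLimiter patch: the class name, the injectable clock and sleeper (there so the logic can be exercised without real sleeping), and the assumption that 1 MB means 1024 * 1024 bytes are all mine. It also asks for a single sleep per batch instead of looping until the target time is reached.

```java
import java.util.function.LongSupplier;

// Hypothetical limiter: cheap byte counting, real pause logic once per ~1 msec of data.
final class AccumulatingRateLimiter {
    /** Hypothetical hook so tests can replace Thread.sleep. */
    interface Sleeper { void sleepNanos(long nanos) throws InterruptedException; }

    private final double nanosPerByte;
    private final long minBatchBytes;   // 1 msec worth of bytes at the target rate
    private final LongSupplier clock;
    private final Sleeper sleeper;
    private long pendingBytes;          // bytes seen since the pause logic last ran
    private long lastNanos;             // time by which earlier batches were due

    AccumulatingRateLimiter(double mbPerSec, LongSupplier clock, Sleeper sleeper) {
        double bytesPerSec = mbPerSec * 1024 * 1024;
        this.nanosPerByte = 1_000_000_000.0 / bytesPerSec;
        this.minBatchBytes = Math.max(1L, (long) (bytesPerSec / 1000.0));
        this.clock = clock;
        this.sleeper = sleeper;
        this.lastNanos = clock.getAsLong();
    }

    /** Convenience factory using System.nanoTime and Thread.sleep. */
    static AccumulatingRateLimiter realTime(double mbPerSec) {
        return new AccumulatingRateLimiter(mbPerSec, System::nanoTime,
                nanos -> Thread.sleep(nanos / 1_000_000L, (int) (nanos % 1_000_000L)));
    }

    /** Records a write; returns the nanoseconds it asked to sleep (0 on the cheap path). */
    long pause(long bytes) throws InterruptedException {
        pendingBytes += bytes;
        if (pendingBytes < minBatchBytes) {
            return 0L;                  // cheap path: no clock read, no sleep
        }
        long batch = pendingBytes;
        pendingBytes = 0L;
        long now = clock.getAsLong();
        // The batch is due its own time budget after the previous due time.
        long due = lastNanos + (long) (batch * nanosPerByte);
        // Idle time earns no credit: never let the reference point lag behind "now".
        lastNanos = Math.max(due, now);
        long toSleep = due - now;
        if (toSleep <= 0L) {
            return 0L;                  // already behind the target rate, do not pause
        }
        sleeper.sleepNanos(toSleep);
        return toSleep;
    }
}
```

At 50 MB/sec the threshold works out to 52428 bytes, so with 8192-byte writes only every 7th call reads the clock. A caller would create the limiter once with `AccumulatingRateLimiter.realTime(50.0)` and call `pause(n)` after each write of `n` bytes.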

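This really improved it: now the merge takes 25.7 sec at 50 MB/sec
throttle, and 64.9 sec at 10 MB/sec throttle. These times seem correct:
both now fall below the worst-case bound of unthrottled time plus
throttle time (33 sec at 50 MB/sec, and 20.8 + 61.1 = 81.9 sec at
10 MB/sec).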

I'll also open a Lucene issue to fix this, and make an XRateLimiter
for ES in the meantime.
