
BulkIndexer: Workers flushing at the same time #646


Description

@rockdaboot

The Problem

The workers tend to flush at (roughly) the same time.

The main reason is that the worker buffers fill at the same rate, because all workers fetch their items from the same Go channel in parallel.

Buffer expiration made this worse: up to go-elasticsearch 8.7 there is a single ticker that flushes all workers at the same time. #624 fixed this for 8.8+.
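
A simplified model of the current behavior (not the actual go-elasticsearch code, and with hypothetical constants): N workers drain one shared channel, so their buffers grow at roughly the same rate and cross the flush threshold at roughly the same moment, producing the simultaneous flushes described above.

```go
// Simplified model: N workers drain one shared channel, so their buffers
// fill evenly and reach flushBytes at roughly the same time.
package main

import (
	"fmt"
	"sync"
)

func main() {
	const (
		numWorkers = 4
		flushBytes = 1 << 10 // hypothetical threshold
	)

	items := make(chan []byte)
	var wg sync.WaitGroup

	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			var buf []byte
			for item := range items {
				buf = append(buf, item...)
				if len(buf) >= flushBytes {
					// All workers tend to reach this point together, so the
					// flushes (HTTP requests) go out in parallel.
					fmt.Printf("worker %d flushing %d bytes\n", id, len(buf))
					buf = buf[:0]
				}
			}
		}(w)
	}

	for i := 0; i < 10000; i++ {
		items <- []byte("{\"index\":{}}\n{\"field\":\"value\"}\n")
	}
	close(items)
	wg.Wait()
}
```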

Flushing at the same time has several bad effects:

  • peak in memory usage: the bulk indexer items are kept in memory until the flush, and the buffer memory (the HTTP body) is allocated and filled at the same time
  • peak in CPU consumption: the HTTP bodies are generated and compressed at the same time
  • peak in network usage: all requests to ES go out in parallel
  • the same peak effects on the Elasticsearch side, leading to 429 responses more often than needed (and expected), which in turn causes more retries and amplifies the peak behavior above

Possible solution: Fill buffers sequentially, flush in background

(Changing the API is out of scope.)

Say the number of workers is set to N, with FlushBytes B and FlushInterval I.
Let the Add() function collect items into an array A0 until B or I is reached, then flush it in the background.
Further calls to Add() go into a new array A1 until B or I is reached, then flush it in the background.
...
Allow at most N background flushes. If that limit is reached, throttle ingestion in the Add() function (as we do now).
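
A minimal sketch of this scheme, using hypothetical names rather than the existing BulkIndexer API, and omitting the FlushInterval timer and a final flush on close for brevity: Add() fills one buffer sequentially, a full buffer is handed off to a background flush, and a semaphore caps the number of in-flight flushes at N, blocking (and thus throttling) Add() when the cap is reached.

```go
// Sketch only: hypothetical types, not the go-elasticsearch BulkIndexer API.
package main

import "sync"

type indexer struct {
	mu         sync.Mutex
	buf        []byte
	flushBytes int
	sem        chan struct{} // capacity N: limits concurrent background flushes
	wg         sync.WaitGroup
}

func newIndexer(n, flushBytes int) *indexer {
	return &indexer{
		flushBytes: flushBytes,
		sem:        make(chan struct{}, n),
	}
}

// Add appends one item to the single active buffer; buffers are filled
// sequentially, so flushes start one at a time as each buffer fills.
func (ix *indexer) Add(item []byte) {
	ix.mu.Lock()
	ix.buf = append(ix.buf, item...)
	if len(ix.buf) >= ix.flushBytes {
		full := ix.buf
		ix.buf = nil
		ix.mu.Unlock()

		ix.sem <- struct{}{} // blocks (throttles Add) if N flushes are in flight
		ix.wg.Add(1)
		go func() {
			defer func() { <-ix.sem; ix.wg.Done() }()
			flush(full) // compress and send the bulk request to Elasticsearch
		}()
		return
	}
	ix.mu.Unlock()
}

func flush(body []byte) {
	// placeholder for building and sending the HTTP bulk request
	_ = body
}

func main() {
	ix := newIndexer(4, 1<<10)
	for i := 0; i < 1000; i++ {
		ix.Add([]byte(`{"index":{}}` + "\n" + `{"field":"value"}` + "\n"))
	}
	ix.wg.Wait() // wait for background flushes before exiting
}
```

Because only one buffer is being filled at any time, flushes are triggered as each buffer fills rather than in lock-step across workers, which naturally spreads the HTTP requests over time.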

Pros:

  • spread the workload over time to reduce peak behavior and pressure on ES

Cons:

  • ?
