-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fbreceiver] - Fix batcher's configuration #42797
Conversation
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
I think the correct parameter to map They immediately trigger a flush based on minimum events/records set. For more info https://www.elastic.co/guide/en/beats/filebeat/current/configuring-internal-queue.html#queue-mem-flush-min-events-option |
I initially did it, but as per the same document you linked:
Also, There is one more problem which is solved by this PR (I'll update description):
i hope that makes sense. |
@carsonip Would be great to get your opinion on this change. |
The batcher validates that
I agree with you. Since every event has to go through two queues before reaching ES. But setting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the ping and PR lgtm.
2 things to note, both are irrelevant to the PR
- one should NOT use es exporter
batcher::max_size_items
orbatchprocessor
'ssend_batch_max_size
in metrics pipelines as it may break down metrics requests which would cause issues with TSDB. In this PR it is a log pipeline and it is fine. - check out the es exporter
flush::bytes
config. It is respected regardless of batcher settings. If 1 batch is greater thanflush::bytes
(default ~5MB) it will be broken down into 2 bulk requests
This is the bug in upstream exporter. It never validates the batcher configuration and we never run into any error. I'm moving this PR to draft for now until I fix upstream. |
if min_size_items == max_size_items, the (first) batch will always be of that size if it isn't triggered by batcher::flush_timeout. |
On a side note, setting min_size_items == max_size_items may be inefficient, as splitting a request and exporting them separately will incur some overhead (think about 2 bulk requests vs 1). e.g. if the current batched request has 1599 records, and the next request has 2 records, I assume that it will trigger a flush with a request of 1600 records and another request of 1 record, which will make it very inefficient. Would be good to have someone to confirm this. |
You're right. Can you please take a look at open-telemetry/opentelemetry-collector-contrib#38072? I'd like to get that merged before continuing on this one. |
e358fb3
to
71040c5
Compare
de3531f
to
4ae89d9
Compare
@@ -149,6 +149,7 @@ func ToOTelConfig(output *config.C) (map[string]any, error) { | |||
"batcher": map[string]any{ | |||
"enabled": true, | |||
"max_size_items": escfg.BulkMaxSize, // bulk_max_size | |||
"min_size_items": 0, // 0 means immediately trigger a flush |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you test this out? I've always avoided setting otel configs to 0, because they will not override any non-0 default. In this case, I'm not sure if setting it explicitly as 0 will override the default 5000. With your PR in open-telemetry/opentelemetry-collector-contrib#38072 you should be able to tell it right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance testing is currently in progress. It looks good and EPS is similar to native filebeat. I'll manually test it as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carsonip attaching a few screenshots for better transparency:
- collector run with
min_size_items: 0
data:image/s3,"s3://crabby-images/06c75/06c75a300c38d44c7b8b8cf4c08cb2ea698d7395" alt="Screenshot 2025-02-21 at 2 20 57 PM"
- collector run while skipping
min_size_items
(it defaults to 5000 in this case)
data:image/s3,"s3://crabby-images/e854c/e854cca41c3eea2829e3c63ba526ffdf1fc0895c" alt="Screenshot 2025-02-21 at 2 21 11 PM"
|
@khushijain21 @mauri870 can you take a look now? I've update PR description with new set of changes. We now set Let me know if you have any concerns. Happy to help you out. |
/test |
* fix: batcher min_size_items * update tests * set min_size_items to 0 * fix tests * fix tests * fix test (cherry picked from commit 2f3df16) # Conflicts: # libbeat/otelbeat/beatconverter/beatconverter_test.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel_test.go
* fix: batcher min_size_items * update tests * set min_size_items to 0 * fix tests * fix tests * fix test (cherry picked from commit 2f3df16) # Conflicts: # libbeat/otelbeat/beatconverter/beatconverter_test.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel_test.go
* fix: batcher min_size_items * update tests * set min_size_items to 0 * fix tests * fix tests * fix test (cherry picked from commit 2f3df16) # Conflicts: # libbeat/otelbeat/beatconverter/beatconverter_test.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel_test.go
* fix: batcher min_size_items * update tests * set min_size_items to 0 * fix tests * fix tests * fix test (cherry picked from commit 2f3df16) # Conflicts: # libbeat/otelbeat/beatconverter/beatconverter_test.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel_test.go
* fix: batcher min_size_items * update tests * set min_size_items to 0 * fix tests * fix tests * fix test (cherry picked from commit 2f3df16)
* fix: batcher min_size_items * update tests * set min_size_items to 0 * fix tests * fix tests * fix test (cherry picked from commit 2f3df16) # Conflicts: # libbeat/otelbeat/beatconverter/beatconverter_test.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel.go # libbeat/otelbeat/oteltranslate/outputs/elasticsearch/config_otel_test.go
Proposed commit message
As per batcher's documentation, there are two configurations related to flushing:
min_size_items
plog.Log
has more records thanmin_size_items
then it is immediately forwarded to the exporter.max_size_items
plog.Log
has more records thanmax_size_items
then it is split into multiple requests.min_size_items
.This PR sets
min_size_items
to 0. This ensures both of the following conditions are satisfied:plog.Log
contains more thanbulk_max_size
items.