Description
Our Kafka instrumentation currently creates a transaction for each individual message. This can lead to a flood of transactions in some use cases, e.g. if an application uses event sourcing and fetches entire Kafka topics during startup.
@eyalkoren explained this in detail in an internal issue:
It was based on the assumption that for these types of usages, Kafka Streams will be used rather than Kafka clients, and this proved to be valid so far. The challenge with messaging systems is that they may be used either as a way of asynchronous communication between services or for this type of data pipelining, and the default desired behavior is very different in each case. This also affects what you want to do with regards to distributed tracing: it is a crucial APM functionality if you use it as a communication facility, and it is irrelevant if you use it to process mass amounts of records.
Knowing that in some cases creating a transaction per batch is preferable, we introduced the internal `message_batch_strategy` option; however, we have not applied it to our Kafka instrumentation so far. I think the best way to deal with it now is to apply it there, for example: change our iteration instrumentation so that it starts a transaction when the iterator is created and ends it when the iterator is exhausted, and let the customer experiment with this config option.
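For illustration, a minimal sketch of that iterator lifecycle, written against the agent's public API (`co.elastic.apm.api`) rather than the actual bytecode instrumentation; the wrapper class and the transaction name are made up for this example:

```java
import java.util.Iterator;

import co.elastic.apm.api.ElasticApm;
import co.elastic.apm.api.Transaction;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Hypothetical wrapper showing the proposed behavior: one transaction spans
// the consumption of the whole batch instead of one transaction per record.
public class BatchIterator<K, V> implements Iterator<ConsumerRecord<K, V>> {

    private final Iterator<ConsumerRecord<K, V>> delegate;
    private final Transaction transaction;
    private boolean ended;

    public BatchIterator(Iterator<ConsumerRecord<K, V>> delegate) {
        this.delegate = delegate;
        // start the transaction when the iterator is created...
        this.transaction = ElasticApm.startTransaction();
        this.transaction.setName("Kafka record batch");
    }

    @Override
    public boolean hasNext() {
        boolean hasNext = delegate.hasNext();
        if (!hasNext && !ended) {
            // ...and end it once the batch is exhausted
            transaction.end();
            ended = true;
        }
        return hasNext;
    }

    @Override
    public ConsumerRecord<K, V> next() {
        return delegate.next();
    }
}
```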
Going forward, we should choose our defaults so that we guard against this type of scenario, for example (a sketch of this decision logic follows the list):
- if batchSize <= 10
  - if message_batch_strategy == single_handling - create a transaction per message
  - else - create a transaction per batch
- else if 10 < batchSize <= 100
  - create a transaction per batch and use span links if messages contain a traceparent header
- else - create a transaction per batch without span links
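The same decision tree as plain Java; the thresholds (10, 100) come straight from the example above, and the enum names are illustrative, not actual agent configuration values:

```java
// Hypothetical encoding of the proposed defaults.
enum BatchHandling {
    TRANSACTION_PER_MESSAGE,
    TRANSACTION_PER_BATCH,               // span links added if records carry a traceparent header
    TRANSACTION_PER_BATCH_WITHOUT_LINKS
}

static BatchHandling decide(int batchSize, boolean singleHandlingConfigured) {
    if (batchSize <= 10) {
        // small batches: respect message_batch_strategy == single_handling
        return singleHandlingConfigured
                ? BatchHandling.TRANSACTION_PER_MESSAGE
                : BatchHandling.TRANSACTION_PER_BATCH;
    } else if (batchSize <= 100) {
        // medium batches: one transaction per batch, with span links when possible
        return BatchHandling.TRANSACTION_PER_BATCH;
    } else {
        // large batches: skip span links to bound the overhead
        return BatchHandling.TRANSACTION_PER_BATCH_WITHOUT_LINKS;
    }
}
```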
We later discovered that `message_batch_strategy` defaults to `batch_handling`, therefore this option can't be respected by default for Kafka. However, the general idea might be interesting: automatically capture the processing of large batches as a single transaction and smaller batches with a transaction per message. "Large" could be a configurable threshold.
An alternative is to instrument the higher-level frameworks that use Kafka and base the decision on knowledge available in the framework. E.g. for Spring we already create a single transaction for `@KafkaListener(batch=true)`.
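For reference, a minimal sketch of such a listener, assuming Spring Kafka 2.8+ (where `@KafkaListener` accepts the `batch` attribute as a string); the class name and topic are made up:

```java
import java.util.List;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class OrderBatchListener {

    // With a batch listener, Spring hands the whole poll result to the method
    // as one List, so the agent can naturally wrap the invocation in a single
    // transaction instead of creating one per record.
    @KafkaListener(topics = "orders", batch = "true")
    public void onBatch(List<ConsumerRecord<String, String>> records) {
        for (ConsumerRecord<String, String> record : records) {
            // process each record within the batch-level transaction
        }
    }
}
```

The framework-level signal (batch vs. single-record listener) removes the guesswork about the user's intent that the threshold-based heuristic above has to rely on.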