Memory leak because of stats collection #1820

@mpolyakov-plutoflume

Description

After enabling stats collection (via stats_cb), I noticed that some of our services started consuming much more memory. I was not able to trace it back to the confluent_kafka library exactly, but disabling stats collection gets rid of the problem.

How to reproduce

I tried to build a minimal example and came across unexpected behavior:

import datetime
import json
import time

from confluent_kafka import Producer


def _stats_cb(stats: str) -> None:
    json.loads(stats)
    print(f"Got stats at {datetime.datetime.now().isoformat()}")


producer = Producer(
    {
        "bootstrap.servers": "PLAINTEXT://localhost:29092",
        "stats_cb": _stats_cb,
        "statistics.interval.ms": 100,
    },
)

producer.produce(topic="email_check.outbound.check.triggered", value=json.dumps({"foo": "bar"}))
print("produced message")
time.sleep(1)
producer.flush()
print("flushed")

In this case, _stats_cb is not called during the one-second sleep. Instead, it is called ten times in a row once flush() is called.
My assumption is that if a service publishes Kafka records relatively infrequently, the pending stats callbacks build up between calls and can lead to memory issues.

Is this intended behaviour?
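
If the callbacks really are only served from poll() and flush(), as the output above suggests, one possible workaround is to call poll(0) regularly from a background thread so stats events are delivered as they arrive instead of piling up. A minimal sketch under that assumption; start_poll_thread is a hypothetical helper name, not part of confluent_kafka, and I have not confirmed that this fixes the memory growth:

```python
import threading


def start_poll_thread(producer, interval_s=0.1):
    """Call producer.poll(0) every interval_s seconds until stopped.

    `producer` is any object with a poll(timeout) method, for example a
    confluent_kafka.Producer. This helper is a hypothetical illustration.
    """
    stop = threading.Event()

    def _loop():
        while not stop.is_set():
            # poll(0) returns immediately after serving any pending callbacks
            producer.poll(0)
            # Sleep for interval_s, but wake up early if stop is set
            stop.wait(interval_s)

    thread = threading.Thread(target=_loop, daemon=True)
    thread.start()
    return stop, thread
```

With the producer from the example above, this would be used as `stop, thread = start_poll_thread(producer)` right after creating the producer, followed by `stop.set()` and `thread.join()` before the final flush().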

Checklist

Please provide the following information:

  • confluent-kafka-python ('2.4.0', 33816576), librdkafka ('2.4.0', 33816831)
  • Apache Kafka broker version: 2.8.2
  • Client configuration: see example
  • Operating system: macOS
  • Provide client logs (with 'debug': '..' as necessary)
  • Provide broker log excerpts
  • Critical issue

Metadata

Labels

  • bug (Reporting an unexpected or problematic behavior of the codebase)
  • component:librdkafka (For issues tied to the librdkafka elements)
  • investigate further (It's unclear what the issue is at this time but there is enough interest to look into it)
  • priority:high (Maintainer triage tag for indicating high impact or criticality issues)
