Fluentd refuses connections when buffer is too loaded #3993

Open
Adrianos712 opened this issue Dec 20, 2022 · 1 comment

@Adrianos712

Describe the bug

We use the Banzai Cloud Logging Operator on OpenShift to provide multi-tenant log forwarding.
Fluentd sends all logs to Kafka, with separate credentials per user.
On our testbed platform, many users still have a Fluentd configuration but no valid credentials. As a result, the buffer directory (/buffers/ in our case) keeps growing, even though the Fluentd buffer configuration should drop logs after a certain amount of time.

Since one log line creates two files in /buffers, hundreds of thousands of files are stuck in that directory, and a simple ls /buffers can take 2 to 5 minutes.
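
For anyone trying to measure the same thing: a plain ls sorts its output, which can be slow on a directory of this size. A minimal Python sketch that streams the directory entries instead is shown here. It assumes the file buffer stores each chunk as a `*.buffer` data file plus a `*.buffer.meta` file; the function name `count_buffer_files` is made up for this example.

```python
import os
from collections import Counter


def count_buffer_files(buffer_dir):
    """Count directory entries by kind, streaming them instead of listing and sorting."""
    counts = Counter()
    # os.scandir yields entries lazily, so nothing is sorted or held in memory.
    with os.scandir(buffer_dir) as entries:
        for entry in entries:
            if entry.name.endswith(".buffer.meta"):
                counts["meta"] += 1   # assumed suffix of chunk metadata files
            elif entry.name.endswith(".buffer"):
                counts["chunk"] += 1  # assumed suffix of chunk data files
            else:
                counts["other"] += 1
    return counts
```

Calling `count_buffer_files("/buffers")` inside the Fluentd container returns a Counter with the keys `chunk`, `meta`, and `other`.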

The more files there are in the buffer directory, the less responsive Fluentd is to new connections (Fluent Bit forwarding or Prometheus scraping).
I suspect the Ruby process freezes while it waits for the OS to complete operations on /buffers.

To Reproduce

Create a Fluentd output whose destination rejects logs, so that logs accumulate in the Fluentd buffer directory.
Then try to connect to Fluentd's listening sockets.
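
The connection check can be scripted so that refusals and timeouts are told apart and timed. This is only a sketch using the Python standard library; `probe_tcp` is a hypothetical helper name, and the host and port in the usage note are taken from the Fluent Bit error in this report.

```python
import socket
import time


def probe_tcp(host, port, timeout=3.0):
    """Attempt one TCP connection and return (status, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            status = "ok"
    except ConnectionRefusedError:
        status = "refused"   # the peer answered with a reset
    except socket.timeout:
        status = "timeout"   # no answer within the timeout
    except OSError as exc:
        status = "error: %s" % exc  # DNS failure, unreachable network, ...
    return status, time.monotonic() - start
```

For example, `probe_tcp("log-collector-ocp-1-fluentd.log-collector.svc", 24240)` could be run in a loop from another pod while the buffer directory fills up, to see when the status changes from "ok" to "refused".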

Expected behavior

Fluentd should keep accepting connections even when the buffer is heavily loaded.

Your Environment

- Fluentd version: 1.14.6
- Source container: ghcr.io/banzaicloud/fluentd:v1.14.6-alpine-6
- Kubernetes distribution: OpenShift 4.8.51

Your Configuration

<buffer tag,time>
  @type file
  chunk_limit_size 256MB
  overflow_action drop_oldest_chunk
  path /buffers/flow:namespace:outputname:output::namespace:outputname.*.buffer
  retry_forever false
  retry_timeout 1h
  timekey 1m
  timekey_use_utc true
  timekey_wait 30s
  total_limit_size 1GB
</buffer>
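
A note on how this configuration relates to the file count: with `tag,time` chunk keys and `timekey 1m`, Fluentd keeps a separate chunk per tag per one-minute window, and the file buffer stores each chunk as a data file plus a metadata file. The rough estimate below is a sketch under stated assumptions, not a measurement: it assumes every tag receives data in every window and that no chunk is split by `chunk_limit_size`. The number of distinct tags and the retention period are hypothetical inputs.

```python
def estimate_buffer_files(distinct_tags, timekey_seconds, retained_seconds,
                          files_per_chunk=2):
    """Rough count of buffer files when each tag gets one chunk per timekey window."""
    # Ceiling division: a partially covered window still holds a chunk.
    windows = -(-retained_seconds // timekey_seconds)
    chunks = distinct_tags * windows
    # Each chunk is assumed to be a data file plus a metadata file.
    return chunks * files_per_chunk


# Hypothetical example: 500 tags, 1-minute timekey, chunks kept for 1 hour.
print(estimate_buffer_files(500, 60, 3600))  # 60000
```

Under these assumptions the file count scales with the number of tags and windows rather than with data volume, so `total_limit_size 1GB` alone may not keep the number of files small.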

Your Error Log

None. The errors appear in Fluentd clients such as Fluent Bit:
log-collector-ocp-1-fluentbit-8xqh7 fluent-bit [2022/11/29 14:51:52] [error] [net] TCP connection failed: log-collector-ocp-1-fluentd.log-collector.svc:24240 (Connection refused)

Additional context

No response

@github-actions

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days
