Update/Reload without downtime #4622

Closed
@daipom

Is your feature request related to a problem? Please describe.

Updating Fluentd or reloading a config causes downtime.
Plugins that receive data as a server, such as in_udp, in_tcp, and in_syslog, cannot receive data during this downtime.
Any data a client sends in that window is lost unless the client can re-send it.
This makes updating Fluentd or reloading a config difficult in some cases.

Describe the solution you'd like

Add a new feature: Update/Reload without downtime.

For example, implement a mechanism similar to nginx's feature for upgrading on the fly.

The main problem is that two Fluentd instances can't run in parallel with the same config.
(They would conflict over shared resources, such as buffer files.)

Because of this problem, it is very difficult to support all plugins.
However, it is possible to support only the plugins that can safely run in parallel.

Based on the above, the following mechanism would be a good way to achieve this.

  1. The current supervisor receives a signal.
  2. The current supervisor sends signals to its workers, and the workers stop all plugins that cannot run in parallel.
  3. The current supervisor starts a new supervisor.
    • => Old processes and new processes run in parallel.
  4. After the new supervisor and its workers start to work, the current supervisor and its workers stop.
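
The four steps above can be sketched as a small in-memory model. This is only an illustration of the ordering, not Fluentd code: `Supervisor`, `Worker`, `PARALLEL_SAFE`, and `zero_downtime_restart` are hypothetical names, and real signal handling and process spawning are replaced by plain method calls.

```python
from dataclasses import dataclass, field

# Assumption: the set of plugins considered safe to run in two processes at once.
PARALLEL_SAFE = {"in_tcp", "in_udp", "in_syslog"}


@dataclass
class Worker:
    plugins: list
    running: set = field(default_factory=set)

    def start(self):
        self.running = set(self.plugins)

    def stop_non_parallel_plugins(self):
        # Keep only the plugins that may overlap with a second process.
        self.running &= PARALLEL_SAFE

    def stop(self):
        self.running.clear()


@dataclass
class Supervisor:
    name: str
    workers: list

    def start(self):
        for worker in self.workers:
            worker.start()

    def is_working(self):
        return all(worker.running for worker in self.workers)

    def stop(self):
        for worker in self.workers:
            worker.stop()


def zero_downtime_restart(current, plugins, worker_count=1):
    """Model of the handover; calling this stands in for step 1 (the signal)."""
    # Step 2: old workers stop every plugin that cannot run in parallel.
    for worker in current.workers:
        worker.stop_non_parallel_plugins()

    # Step 3: start a new supervisor with the full plugin set.
    new = Supervisor("new", [Worker(list(plugins)) for _ in range(worker_count)])
    new.start()

    # Old and new now run in parallel; record what the old side still runs.
    overlap = [set(worker.running) for worker in current.workers]

    # Step 4: once the new side is working, the old side stops.
    if new.is_working():
        current.stop()
    return new, overlap
```

In this model, an old worker that was running `in_tcp`, `in_tail`, and `out_file` keeps only `in_tcp` alive during the overlap, so the listening side is never fully down.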

More specifically, it would be better to run only a limited set of Input plugins in parallel, such as in_tcp, in_udp, and in_syslog.
The old workers stop all plugins except those Input plugins and prepare a dedicated file buffer as the output for the events those plugins receive.
After the new workers start, they load the file buffer and route those events to the @ROOT label.
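
The file-buffer handover could look roughly like the sketch below. It is an assumption-laden illustration: the JSON-lines file is not Fluentd's actual buffer format, and `buffer_event`, `drain_to_root`, and the `emit` callback (standing in for routing to the @ROOT label) are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def buffer_event(path, tag, record):
    """Old worker side: append one received event to the dedicated buffer file."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"tag": tag, "record": record}) + "\n")


def drain_to_root(path, emit):
    """New worker side: replay buffered events in order, then remove the file."""
    if not path.exists():
        return 0
    count = 0
    with path.open(encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)
            emit(event["tag"], event["record"])  # stand-in for routing to @ROOT
            count += 1
    path.unlink()
    return count


def demo():
    routed = []
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "handover.buffer"
        # Events that arrive while the old workers are the only listeners.
        buffer_event(path, "syslog.app", {"message": "first"})
        buffer_event(path, "syslog.app", {"message": "second"})
        # The new workers have started: replay everything.
        drained = drain_to_root(path, lambda tag, rec: routed.append((tag, rec)))
        return drained, routed, path.exists()
```

Deleting the file only after every event has been replayed means a crash during replay may route some events twice but should not drop them; whether that trade-off is acceptable is a design decision left open here.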

Describe alternatives you've considered

None.

Additional context

I have already started to create a PoC.
