Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append components flag #8754

Open
djaglowski opened this issue Oct 25, 2023 · 7 comments
Open

Append components flag #8754

djaglowski opened this issue Oct 25, 2023 · 7 comments

Comments

@djaglowski
Copy link
Member

Is your feature request related to a problem? Please describe.
When merging configurations, it would be helpful to be able to do the following:

  • Add a receiver, processor, or exporter to an existing pipeline
  • Add an extension to the service

For example, consider the following configs.

# main.yaml
receivers:
  otlp/in:
processors:
  batch:
exporters:
  otlp/out:
extensions:
  file_storage:

service:
  pipelines:
    traces:
      receivers: [ otlp/in ]
      processors: [ batch ]
      exporters: [ otlp/out ]
  extensions: [ file_storage ]
# extra_receiver.yaml
receivers:
  foo:

service:
  pipelines:
    traces:
      receivers: [ foo ]
# extra_extension.yaml
extension:
  bar:

service:
  extensions: [ bar ]

If a user provides the above configs in the same order, it will clobber service::pipelines::traces::receivers and service::extensions, effectively removing the otlp/in receiver and file_storage extension from the configuration.

Describe the solution you'd like
I propose that these lists are components are a special case and that we should provide the option to append to them, rather than clobber them.

Specifically I propose that we add a flag to the collector, e.g. --append-components or --merge-components-mode=append, which will enable this behavior when merging configs.

Describe alternatives you've considered
The same idea could be applied at a more granular level. A simple option would be to have a similar series of flags, e.g. --append-receivers, ...

Additional context
Several recent SIG meetings have included some discussion of idea along these lines. Additionally, some issues may be related.

@pavolloffay
Copy link
Member

+1 on the proposal.

Are there any updates on this?

@tigrannajaryan
Copy link
Member

Specifically I propose that we add a flag to the collector, e.g. --append-components or --merge-components-mode=append, which will enable this behavior when merging configs.

Why does this need to be an option? Is there a case where appending the pipeline lists is undesirable?

Also, the list-appending logic should be name-aware so that if the same-named component is in both input lists it only appears once in the destination list.

@djaglowski
Copy link
Member Author

Why does this need to be an option? Is there a case where appending the pipeline lists is undesirable?

I'm not sure if there are cases where overwriting the list is actually preferred but it is the current behavior. Therefore, I proposed this as opt-in behavior in order to respect backwards compatibility.

If there is consensus, we could just implement this with a feature gate. This would be equivalent to the initial proposal while the gate is alpha. Then when we move it to beta we'd presumably hear if anyone is relying on the current behavior, while still leaving them a way to opt out.

Also, the list-appending logic should be name-aware so that if the same-named component is in both input lists it only appears once in the destination list.

Agreed

@VihasMakwana
Copy link
Contributor

@djaglowski are we actively seeking contributions for this enhancement? I have encountered this limitations many times while using multiple configs.

@djaglowski
Copy link
Member Author

@VihasMakwana, IMO it would be a nice contribution but I would suggest that the @open-telemetry/collector-maintainers should weigh in on this.

@andrzej-stencel
Copy link
Member

In my opinion the possibility of replacing lists (as it works today) is still valuable and I don't want to remove it by only providing the possibility to append (add) components to lists.

Replacing is valuable in customization of pipelines that may have some default behavior for users to modify. Examples:

  • User was provided with a configuration to collect metrics from Kubernetes using a Prometheus receiver, but they want to use the OTLP receiver instead to collect the metrics. Or maybe they want to use the OTLP JSON File receiver for some testing.
  • User was provided a configuration that uses a processor (like Filter or Transform or Log Deduplication processor) to remove part of the telemetry by default. The user wants to remove that processor or set of processors to see all the data that is coming through the pipeline. Or maybe they want to further customize the was the telemetry is processed by adding, removing and reordering processors.
  • User was provided with a configuration that uses a vendor exporter to send data. The user wants to replace the vendor exporter with the OTLP exporter or Debug exporter or File exporter.

I agree the possibility to append (add) components to lists is valuable. Ideally we should come up with a solution that makes it possible to do both/either, even in the same configuration file. User might want to append to the list of extensions, but replace the list of receivers/processors/exporters in a pipeline. Or append to receivers in one pipeline, but replace the list of receivers in another pipeline.

One simplistic solution that would allow this is to add new list properties like add_extensions, add_receivers, add_processors, add_exporters beside extensions, receivers, processors, exporters. If the user specifies add_extensions, the extensions listed in it would be added to the existing list. If the user specifies extensions, existing behavior is used (replacing the list). User can only specify either one or the other in a single configuration file. I'm not sure if this would be easy to implement or if it wouldn't make the configuration too complicated for users. Example:

base.yaml:

service:
  pipelines:
    logs/1:
      receivers:
        - otlp
      processors: [...]
      exporters:
        - otlp
    metrics/1:
      receivers:
        - otlp
      processors: [...]
      exporters:
        - otlp

user.yaml:

service:
  pipelines:
    logs/1:
      receivers:
        - otelarrow
      add_exporters:
        - otelarrow
    metrics/1:
      add_receivers:
        - prometheus
      exporters:
        - debug

The effective configuration from running otelcol-contrib --config=base.yaml --config=user.yaml would be:

service:
  pipelines:
    logs/1:
      receivers:
        - otelarrow
      processors: [...]
      exporters:
        - otlp
        - otelarrow
    metrics/1:
      receivers:
        - otlp
        - prometheus
      processors: [...]
      exporters:
        - debug

@VihasMakwana
Copy link
Contributor

@andrzej-stencel I agree with your perspective. Both merging strategies have their merits.

Ultimately, the choice should be left to the user, either through a configuration setting or a feature gate. Additionally, the existing merging strategy should continue to be the default option.


It seems we can leverage the existing merging logic from this snippet used in contrib repo, which currently handles the merging of extensions. However, it should be modified to be name-aware.

We can extend this approach to also handle the merging of processors, receivers, and exporters in a similar manner.

I request @open-telemetry/collector-maintainers to provide their feedback on this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants