Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[otelcol] Prevent overwriting of pipelines #1142

Closed

Conversation

djaglowski
Copy link
Member

@djaglowski djaglowski commented Sep 24, 2023

Changes

Currently, some elements of the pipelines defined in otelcol-config.yml are clobbered by conflicting definitions in otelcol-observability.yml. Specifically, the exporters list in the traces and metrics pipelines.

# in otelcol-config.yml
pipelines:
  traces:
    ...
    exporters: [logging, spanmetrics] # blown away by list in otelcol-observability.yml
  metrics: 
    ...
    exporters: [logging]              # blown away by list in otelcol-observability.yml
# in otelcol-observability.yml
pipelines:
  traces:
    exporters: [otlp, logging, spanmetrics] # blows away list in otelcol-config.yml
  metrics:
    exporters: [prometheus, logging]        # blows away list in otelcol-config.yml

Since we expect users to customize the configuration in many cases, I see the following issues:

  1. If a user adds an exporter directly to one of the lists in otelcol-config.yml, it has no effect. There's no error or obvious way to detect this - it's just silently ignored.
  2. The two exporter lists in otelcol-config.yml are effectively ignored so they could be deleted. However, this would mean that it is not a valid config file on its own. i.e. A user who excludes otelcol-observability.yml may be surprised to find that this breaks the service.

The solution I'm proposing uses the forward connector to mostly decouple the two files.

  • Each pipeline in otelcol-config.yml is split into two, connected by the forward connector. One pipeline receives and processes data. The other exports. The functionality is the same, but the pipelines are no long clobbered and its possible to exclude otelcol-observability.yml.
  • The pipelines in otelcol-observability.yml are separated from those in otelcol-config.yml. Instead, they receive data from forward connector. This allows any additional components to be added to pipelines in either file without conflicts.

The downside of this approach is that it increases the apparent complexity of the configuration. This may be a barrier to entry for some users, even if it is less error prone.

Merge Requirements

For new features contributions please make sure you have completed the following
essential items:

  • CHANGELOG.md updated to document new feature additions
  • Appropriate documentation updates in the docs
  • Appropriate Helm chart updates in the helm-charts

Maintainers will not merge until the above have been completed. If you're unsure
which docs need to be changed ping the
@open-telemetry/demo-approvers.

@djaglowski djaglowski marked this pull request as ready for review September 24, 2023 14:38
@djaglowski djaglowski requested a review from a team September 24, 2023 14:38
@julianocosta89
Copy link
Member

Hello @djaglowski, thanks for that!
I like the suggestion and I'd be up for adding it!

This is actually a problem that is there for some time already: open-telemetry/opentelemetry.io#3065

Your approach would solve it in a better way!

@djaglowski
Copy link
Member Author

Thanks for the feedback @julianocosta89. I've added a changelog and proposed open-telemetry/opentelemetry.io#3311 to update the docs.

It's not clear to me if the same changes need to be applied to the kubernetes/helm config & docs because it seems users may be more inclined to modify the primary config directly. What do you think?

Copy link
Member

@julianocosta89 julianocosta89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@julianocosta89
Copy link
Member

Regarding the Helm chart I do agree with you.
Just pinging @puckpuck to bring him to the discussion

@puckpuck
Copy link
Contributor

I'm not quite clear on the use case we are trying to solve for here. We have otelcol-config-extras.yml which is where people and vendors should be doing all changes.

The demo should strive to try and adhere to idiomatic conventions where possible on how people should be instrumenting and managing the associated telemetry for their applications. The collector configuration in this PR significantly complicates the configuration away from those conventions.

I think we should clarify how to extend the telemetry pipeline in docs using otelcol-config-extras.yml as mentioned in this issue

@djaglowski
Copy link
Member Author

I'm not quite clear on the use case we are trying to solve for here.

My use case is as follows: I'd like to export all telemetry data via one additional exporter.

We have otelcol-config-extras.yml which is where people and vendors should be doing all changes.

The demo should strive to try and adhere to idiomatic conventions where possible on how people should be instrumenting and managing the associated telemetry for their applications.

The problem is that we're asking users to write configuration in the extras file which clobbers configuration in the other files.

This is not simple or idiomatic in my opinion. Anecdotally, I'm a maintainer of the collector but I still spent hours struggling to accomplish my use case because I was of a mind that I could make additive changes in one file without needing to be concerned for the others.

To oversimplify the situation in order to highlight the problem, we basically have the following:

# otelcol-config.yaml
exporters: [a, b]

# otelcol-observability.yaml
exporters: [a, b, c]

# otelcol-extras.yaml
# Add whatever you want to "exporters"!
exporters: [z] # User "added" z

The end result here is that [a, b, c] clobbers [a, b], and then [z] clobbers [a, b, c].

  1. There's no point in defining [a, b] in the first place, if we expect everyone to let it be clobbered.
  2. More to the point, we shouldn't encourage novice users to clobber anything. The worst part is that there's no error message that says this is happening because this is "expected" behavior when merging config elements that are lists. The user may notice that they aren't seeing data at a, b, or c, but the reason why is not going to be apparent to most users.

Effectively, we're presenting the extras file as an independent sandbox, but it seems to me that the design and guidance will steer most users to unintentionally invalidate elements of the other files.

The collector configuration in this PR significantly complicates the configuration away from those conventions.

I think whatever conventions we were trying follow were misguided and will lead to frustration more often than results.

I agree that my proposed changes are more complicated, if the goal is for novice users to grasp the totality of the configuration. However, if the goal is to make it easy for new users to add their own backend, then I think what I've proposed is a much more reliable solution.

I think we should clarify how to extend the telemetry pipeline in docs using otelcol-config-extras.yml as mentioned in this open-telemetry/opentelemetry.io#3065

There is an accompanying PR here which I believe clarifies the expectations for those running with docker compose. The issue you mentioned appears to specifically relate to the kubernetes equivalent. I'd rather defer to others as to whether the same changes make sense there, but my intuition was that it is less of an issue.

@julianocosta89
Copy link
Member

julianocosta89 commented Sep 27, 2023

I have to disagree @puckpuck.
Besides the cases already shown by @djaglowski in his initial description: #1142 (comment)

The point 1 is that we have some users struggling to use the bring-your-own-backend instructions since we introduced spanmetrics.
Second, atm, our docs do not create a 2nd pipeline, which causes the overwrite of the traces:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/example]

And breaks the metrics pipeline.

By simply creating a new pipeline we would avoid breaking the default ones:

service:
  pipelines:
    traces/byob:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/example]

But in this approach we would have to duplicate everything.

In the suggested solution, we demonstrate a new connector and the code below wouldn't break anything:

service:
  pipelines:
    traces/example:
      receivers: [forward]
      processors: [] # optional
      exporters: [otlphttp/example]

Another great point of the forward connector is that when used in the metrics pipeline we also get the spanmetrics into it (because it was defined before), so we do not need to keep re-adding receivers and exporters .

@mviitane
Copy link
Member

I have also struggled quite a bit with the current Collector config. I like this new proposal because when bringing a new backend, I don't need to care what's in the otelcol-observability.yml. As I understood, this fully removes the dependency between otelcol-config-extras.yml and otelcol-observability.yml.

To help users with this new setup, a picture would be nice, showing the forward connector and 3 different config files, with an example pipeline to the user’s own backend.

@puckpuck
Copy link
Contributor

puckpuck commented Sep 27, 2023

My issue is this will complicate a collector configuration that people use as a reference implementation to make it slightly easier to bring your backend. And even that slightly easier, I will argue, since people will need to know to use forward as the receiver. That will need to be documented. Similar is true if we document for people to include spanmetrics as an exporter for traces if they override that pipeline. In both cases, we need to enrich our docs.

This PR is just trading one problem (documenting spanmetrics in exporter) with another (documenting forward in receiver), while also making the reference configuration far more complicated.

This is what Honeycomb does for bring your own backend, all configured in otel-config-extras.yml. The key to making this work was to include spanmetrics in the exporter list for the traces pipeline. That's the part that needs to be documented.
https://github.com/honeycombio/opentelemetry-demo/blob/main/src/otelcollector/otelcol-config-extras.yml

@djaglowski
Copy link
Member Author

This PR is just trading one problem (documenting spanmetrics in exporter) with another (documenting forward in receiver), while also making the reference configuration far more complicated.

The problem I'm trying to solve is neither of these.

In any case, I think that if the goal is to provide a simple reference implementation, there should be only one config file so that no merging is necessary.

@puckpuck
Copy link
Contributor

This PR is just trading one problem (documenting spanmetrics in exporter) with another (documenting forward in receiver), while also making the reference configuration far more complicated.

The problem I'm trying to solve is neither of these.

In any case, I think that if the goal is to provide a simple reference implementation, there should be only one config file so that no merging is necessary.

We only do the merging to make it easier to bring your own backend. We should probably add comments to the other files instructing people not to modify them and instead add what they want to the otel-config-extras.yml file instead.

@djaglowski
Copy link
Member Author

We only do the merging to make it easier to bring your own backend. We should probably add comments to the other files instructing people not to modify them and instead add what they want to the otel-config-extras.yml file instead.

The whole point is that modifying otel-config-extras.yml is problematic because it effectively destroys content in the other files. It doesn't make anything easier and you can't solve this with comments.

@puckpuck
Copy link
Contributor

The whole point is that modifying otel-config-extras.yml is problematic because it effectively destroys content in the other files. It doesn't make anything easier and you can't solve this with comments.

Kind of. It really only clobbers arrays. The same is true when working with Helm charts, where part of the inspiration for this approach was adopted.

We have 3 files because we wanted:

  • a clean config with no backend at all
  • a config that only defines observability backends
  • a config to bring your own backend

Where we failed is we did not document this in any capacity anywhere. It was just discussed in SIG meetings and implemented. We should document this as both comments in the config files, and on the website for the demo itself. The only "gotcha" is we need to be clear that if overriding the exporter in the traces pipeline, you need to include spanmetrics or an error will occur.

@github-actions
Copy link

github-actions bot commented Oct 5, 2023

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 5, 2023
@puckpuck puckpuck mentioned this pull request Oct 12, 2023
2 tasks
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Oct 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants