Local configuration is overwritten #32048

Open
mike9421 opened this issue Mar 31, 2024 · 11 comments

Comments

mike9421 commented Mar 31, 2024

Component(s)

cmd/opampsupervisor

What happened?

Description

When I run opampsupervisor (the backend is OpAMP's sample server), I noticed that EffectiveConfig gets cleared, leaving only the contents of ownMetricsCfg and ExtraLocalConfig.

Steps to Reproduce

  1. Fill in the correct configuration to ensure that the opampsupervisor can connect to the backend normally
  2. Start the backend OpAMP sample server
  3. Fill in the effective configuration in local effective.yaml
  4. Start opampsupervisor
  5. Check the otel configuration displayed by the OpAMP UI or view the local file named effective.yaml.

Expected Result

The content I added to the local configuration file named effective.yaml should not be cleared after the connection is established.

Actual Result

The configuration content I added was cleared.

Collector version

all

Environment information

Environment

OS: Darwin
Compiler: go 1.21

OpenTelemetry Collector configuration

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: otel-collector
          scrape_interval: 10s
          static_configs:
            - targets:
              - 0.0.0.0:55149
exporters:
  otlphttp:
    endpoint: http://localhost:4318/v1/metrics
service:
  pipelines:
    metrics:
      exporters:
        - otlphttp
      receivers:
        - prometheus

Log output

No response

Additional context

I know the reason is that the backend server returns an empty remote configuration. However, I believe that the opampsupervisor needs to handle such situations to avoid the issue where configurations get overwritten due to logic errors in the backend.

@mike9421 added the bug and needs triage labels on Mar 31, 2024

@evan-bradley (Contributor)

The reason for this behavior is that the effective.yaml file is only intended to be written by the Supervisor. Can you explain more about your use case? The Supervisor specification allows for the possibility of using local Collector config along with remote config, would this work for you?

@evan-bradley added the enhancement and priority:p2 labels and removed the bug and needs triage labels on Apr 16, 2024
@mike9421 (Author)

> The reason for this behavior is that the effective.yaml file is only intended to be written by the Supervisor. Can you explain more about your use case? The Supervisor specification allows for the possibility of using local Collector config along with remote config, would this work for you?

Thanks for your reply. I know that the effective.yaml file is meant for the supervisor to write, and the supervisor retrieves remote configurations from the OpAMP server.

The behavior described in this issue occurs when the remote configuration provided by the OpAMP server is empty (item.Body is "") at this call:

err = k2.Load(rawbytes.Provider(item.Body), yaml.Parser())

The final configuration then becomes the default configuration provided by the Supervisor.

The cause of this issue is that the OpAMP server sends an empty configuration when the connection is first established. Has OTel considered guarding against this scenario (for example, by using the saved configuration)?
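
The "use the saved configuration" idea could look roughly like the Go sketch below. It is not the Supervisor's actual code: chooseConfig is a hypothetical helper, and the sketch assumes the remote body and the last saved config are both available as raw bytes at the point where the remote config is loaded.

```go
package main

import (
	"bytes"
	"fmt"
)

// chooseConfig is a hypothetical helper: it returns the remote body unless
// that body is empty or whitespace-only, in which case it falls back to the
// last saved config. The bool reports whether the remote body was used.
func chooseConfig(remote, saved []byte) ([]byte, bool) {
	if len(bytes.TrimSpace(remote)) == 0 {
		return saved, false
	}
	return remote, true
}

func main() {
	// Assumed example data: a previously saved config and an empty remote body.
	saved := []byte("receivers:\n  prometheus: {}\n")
	cfg, usedRemote := chooseConfig([]byte(""), saved)
	fmt.Printf("usedRemote=%v\n%s", usedRemote, cfg)
}
```

Whether an empty body should mean "keep what I have" or "clear the config" is a policy decision; the sketch only shows where such a check could sit.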

@tigrannajaryan (Member)

> The cause of this issue is that the OpAMP server sends an empty configuration when the connection is first established.

Why is the server doing this? I don't think this is compliant with the spec. Which server implementation is this?

@mike9421 (Author)

> > The cause of this issue is that the OpAMP server sends an empty configuration when the connection is first established.
>
> Why is the server doing this? I don't think this is compliant with the spec. Which server implementation is this?

@tigrannajaryan I'm using the OpAMP-go server example.

When a connection is established for the first time, the OpAMP server has no RemoteConfig for the OTel agent, so it returns an empty (non-nil) configuration.
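
The difference between an empty non-nil config and no config at all can be sketched on the server side as follows. This is a hypothetical illustration, not the opamp-go example server's code: remoteConfig, configStore, and offerFor are made-up names, and the sketch assumes the server can simply leave the remote config out of its reply when it has nothing stored for the agent.

```go
package main

import "fmt"

// remoteConfig is a hypothetical stand-in for a remote config offer.
type remoteConfig struct {
	Files map[string][]byte
}

// configStore maps an agent ID to the config the server holds for it.
type configStore map[string]*remoteConfig

// offerFor returns nil when the server has no config, or only empty files,
// for the agent. The caller can then omit the config from its reply instead
// of sending an empty non-nil one.
func (s configStore) offerFor(agentID string) *remoteConfig {
	cfg := s[agentID]
	if cfg == nil {
		return nil
	}
	for _, body := range cfg.Files {
		if len(body) > 0 {
			return cfg
		}
	}
	return nil
}

func main() {
	// Assumed example data: one agent with a stored config, one with an empty file.
	store := configStore{
		"agent-with-config": {Files: map[string][]byte{"": []byte("receivers: {}\n")}},
		"agent-without":     {Files: map[string][]byte{"": {}}},
	}
	for _, id := range []string{"agent-with-config", "agent-without", "unknown"} {
		fmt.Printf("%s: offer config = %v\n", id, store.offerFor(id) != nil)
	}
}
```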

@tigrannajaryan (Member)

I think the server should send back the config after the agent sends the first message that contains the AgentDescription message. If that is not happening then I think it is a server bug.

@mike9421 (Author)

> I think the server should send back the config after the agent sends the first message that contains the AgentDescription message. If that is not happening then I think it is a server bug.

I also think it is a server error. Should OTel handle this situation anyway? After all, it is very important to ensure that the agent keeps an effective configuration when remote configuration is in use.


This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.


@github-actions bot added the Stale label on Aug 19, 2024
@mike9421 (Author)

> > I think the server should send back the config after the agent sends the first message that contains the AgentDescription message. If that is not happening then I think it is a server bug.
>
> I also think it is a server error. I would like to ask if OTel needs to deal with this situation? After all, it is very important to ensure that the agent configuration is effective in remote configuration.

If this does happen, would it be better for OTel to detect it, to ensure OTel's stability?

@tigrannajaryan (Member)

> I also think it is a server error. I would like to ask if OTel needs to deal with this situation? After all, it is very important to ensure that the agent configuration is effective in remote configuration.

I am not sure what exactly the Supervisor can do in this situation if the server misbehaves. I suggest filing a bug against the server (please include the repro steps).

@github-actions bot removed the Stale label on Aug 21, 2024
@github-actions bot added the Stale label on Oct 21, 2024