Status subcommand reporting agent status for otel mode - Phase 2 #4047

Merged: 11 commits into elastic:main on Jan 22, 2024

Conversation

@michalpristas (Contributor) commented Jan 9, 2024

This is phase 2 of making status for otel mode work properly.

Bold items mark the differences from the previous phase.

Phase 1:

  • agent capable of running in normal OR otel mode
  • coordinator not running at all
  • otel collector running in a separate thread
  • otel agent not able to report state

Phase 2: We are here

  • agent capable of running in normal OR otel mode
  • coordinator running with dumb config provider (no config; see the illustrative sketch below)
  • otel collector running in a separate thread
  • status reports only agent status (otel status not reported)

Partially fixes: #3872
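
As an aside for readers unfamiliar with the "dumb config provider" idea, below is a minimal, hypothetical sketch of what such a provider could look like. The interface and type names are invented for illustration only and are not the actual elastic-agent APIs; the point is simply that the coordinator is wired to a configuration source that never produces anything, so it can run and report its own status without managing any components.

```go
// Hypothetical illustration only: names below are invented for this sketch
// and are not the actual elastic-agent types.
package main

import "fmt"

// ConfigProvider is a stand-in for whatever interface the coordinator
// consumes configuration from.
type ConfigProvider interface {
	// Watch returns a channel of configuration updates.
	Watch() <-chan map[string]interface{}
}

// dumbProvider never emits any configuration, so the coordinator runs
// (and can report its own status) without managing any components.
type dumbProvider struct{}

func (dumbProvider) Watch() <-chan map[string]interface{} {
	ch := make(chan map[string]interface{})
	// Intentionally never send: no config means no components to manage.
	return ch
}

func main() {
	var p ConfigProvider = dumbProvider{}
	select {
	case cfg := <-p.Watch():
		fmt.Println("got config:", cfg)
	default:
		fmt.Println("no configuration provided (otel mode)")
	}
}
```

In this mode the otel collector still runs in its own thread, outside anything the coordinator manages.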

@michalpristas added the enhancement, Team:Elastic-Agent-Control-Plane, Team:Elastic-Agent, backport-skip, and skip-changelog labels on Jan 9, 2024
@michalpristas self-assigned this on Jan 9, 2024
@michalpristas marked this pull request as ready for review on January 10, 2024 08:00
@michalpristas requested a review from a team as a code owner on January 10, 2024 08:00
@elasticmachine (Contributor) commented:

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pierrehilbert (Contributor) commented:

Failure related to #3342

@michalpristas removed the request for review from faec on January 10, 2024 08:23
@ycombinator (Contributor) commented:

@michalpristas Just to clarify, phase 1 was implemented as part of #3800, right?

@ycombinator (Contributor) commented Jan 10, 2024

Also, I'm confused by this:

Bold items mark the differences from the previous phase.

Phase 1
otel collector running in a separate thread

Phase 2
otel collector running in a separate thread

What is the difference (indicated by the bolding in Phase 2)?

@michalpristas (Contributor, author) commented:

my bad

@ycombinator (Contributor) commented:

I built Agent from this PR's branch, extracted it, and ran $ ./elastic-agent -c otel.yml status. I'm getting the following output:

Error: failed to communicate with Elastic Agent daemon: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /private/tmp/elastic-agent-8.12.0-SNAPSHOT-darwin-aarch64/elastic-agent.sock: connect: no such file or directory"
For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.12/fleet-troubleshooting.html

Is this expected behavior, even with the changes in this PR?

Please ignore the 8.12 references above; that's because we're currently having an issue with building Agent from main with EXTERNAL=true; the workaround is to also specify BEAT_VERSION=8.12.0.


BTW, I rewrote the issue that this PR is supposed to resolve to include more details: #3872. Not sure if that changes the requirements for this PR or if a follow-up PR will be needed to completely resolve that issue. Apologies if the original issue description was not clear enough.

@michalpristas (Contributor, author) commented Jan 12, 2024

The workflow I tested:

  • mage package agent
  • sudo ./elastic-agent run -c otel.yml (alternatively, run without -c to confirm normal agent behavior still works)
  • keep it running
  • in another terminal, run sudo ./elastic-agent status; it should show output (the -c flag is not needed)

What are your expectations when running ./elastic-agent -c otel.yml status on its own?

@ycombinator (Contributor) commented Jan 12, 2024

What are your expectations when running ./elastic-agent -c otel.yml status on its own?

None, I had a brain fade 😊. Of course this cannot be expected to work, as we need a running Agent + gRPC server for status to talk to. 🤦

Your workflow makes sense for running Agent in OTel mode. I will test it with this PR. Thanks.

@ycombinator (Contributor) commented:

Confirmed that the steps listed in #4047 (comment) work as expected. Reviewing the implementation now...

@cmacknz (Member) commented Jan 18, 2024

I think this is probably mostly covered by integration tests and the SonarQube complaint can be bypassed.

status reports only agent status (otel status not reported)

It seems to me like the integration test coverage could be improved to cover at least this, by running elastic-agent status in one of the otel tests and confirming it succeeds. Before this change, this command would unconditionally fail, wouldn't it?

Right now you have no net new test coverage at all, but there is actually added functionality to cover here.
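
For illustration, here is a minimal sketch of the kind of check being suggested: run elastic-agent status against an already-running agent in otel mode and assert the command succeeds. It uses plain os/exec and a hypothetical binary path; the real check lives in the project's integration test fixtures (the TestOtelFileProcessing test mentioned below), so treat this only as a sketch of the assertion, not the actual implementation.

```go
package otel_test

import (
	"context"
	"os/exec"
	"testing"
	"time"
)

// TestStatusSucceedsInOtelMode is a sketch only: it assumes an agent started
// with `elastic-agent run -c otel.yml` is already running on this host, and
// that the packaged binary is at ./elastic-agent (hypothetical path).
func TestStatusSucceedsInOtelMode(t *testing.T) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// `status` talks to the running agent's control socket; the -c flag is
	// not needed here.
	out, err := exec.CommandContext(ctx, "./elastic-agent", "status").CombinedOutput()
	if err != nil {
		// Before this PR, status unconditionally failed in otel mode because
		// no coordinator (and therefore no control server) was running.
		t.Fatalf("elastic-agent status failed: %v\noutput: %s", err, out)
	}
	t.Logf("status output: %s", out)
}
```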

Quality Gate failed

Failed conditions:

0.0% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube

@michalpristas (Contributor, author) commented Jan 19, 2024

Status check added to the TestOtelFileProcessing test.

@cmacknz (Member) left a review comment:

Thanks for the test!

@cmacknz (Member) commented Jan 22, 2024

Force merging, ignoring the code coverage complaint because this is now covered in the integration tests.

@cmacknz merged commit c588a73 into elastic:main on Jan 22, 2024
8 of 9 checks passed