[Upgrade Watcher] InstallChecker incorrectly detects systemd Agent service as not installed #3188

@ycombinator

Description

For confirmed bugs, please report:

Version: 8.10.0-SNAPSHOT / main

Operating System: Linux / Ubuntu

Steps to Reproduce:

Reproducing this bug involves upgrading to an Agent build where the Agent binary deliberately crashes.

  1. Build the Agent package from the latest main. This Agent will serve as our source, i.e. pre-upgrade, Agent.

    DEV=true EXTERNAL=true SNAPSHOT=true PLATFORMS=linux/arm64 PACKAGES=tar.gz mage package
    
  2. Unpack and install this Agent.

  3. Now create a branch off main where we will build an Agent that crashes upon start. This Agent will serve as our target, i.e. post-upgrade, Agent.

    git checkout -b crashing-agent
    
  4. Download and apply the patch, which produces an Agent that can be upgraded to but crashes upon start.

    wget -qO- https://github.com/elastic/elastic-agent/files/12255794/crashing-agent.patch | git apply -
    
  5. Commit the changes so we have a new commit SHA in the crashing Agent's version. Otherwise the upgrade will not happen.

    git commit -am "[Testing] Crashing agent"
    
  6. Build the Agent from the crashing-agent branch. Since the Agent version has been bumped up and the corresponding component binaries won't be available, make sure to set AGENT_DROP_PATH to nothing. Also make sure NOT to use SNAPSHOT=true; otherwise, the upgrade process will try to download the artifact from the snapshots repository.

    DEV=true AGENT_DROP_PATH= PLATFORMS=linux/arm64 PACKAGES=tar.gz mage package
    
  7. Upgrade from the installed Agent to the crashing Agent.

    sudo elastic-agent upgrade 8.11.0 --source-uri file://build/distributions/ --skip-verify
    
  8. Check the Upgrade Watcher logs. Note that these will be in the target Agent's data path, so make sure to use the appropriate short commit SHA in the path below.

    sudo cat /opt/Elastic/Agent/data/elastic-agent-81948d/logs/elastic-agent-watcher-$(date +%Y%m%d).ndjson
    
  9. Notice that the Upgrade Watcher reports the Agent service as not installed and exits early as a result.

    {"log.level":"warn","@timestamp":"2023-08-03T22:38:20.030Z","log.origin":{"file.name":"cmd/watch.go","file.line":195},"message":"Agent uninstall detected","ecs.version":"1.6.0"}
    {"log.level":"error","@timestamp":"2023-08-03T22:38:20.030Z","log.origin":{"file.name":"cmd/watch.go","file.line":107},"message":"Exiting early due to: %vElastic Agent was uninstalled: service is not installed","ecs.version":"1.6.0"}
    {"log.level":"error","@timestamp":"2023-08-03T22:38:20.030Z","log.origin":{"file.name":"cmd/watch.go","file.line":51},"message":"Watch command failed","error":{"message":"Elastic Agent was uninstalled: service is not installed"},"ecs.version":"1.6.0"}
    
  10. Verify that the service is, in fact, installed. It's just that the Agent process being governed by the service keeps crashing.

    systemctl status elastic-agent.service
    
  11. Cleanup: delete the branch we used to build the crashing Agent.

    git checkout main
    git branch -D crashing-agent
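
The mismatch between steps 9 and 10 comes down to two separate questions: is the unit known to systemd, and is its process currently up? Below is a minimal sketch of telling the two apart from the shell. It assumes that a `LoadState` of `not-found` is the only state that should count as "not installed". `classify_unit` is a hypothetical helper written for this illustration; it is not part of elastic-agent and is not a description of how InstallChecker works.

```shell
# Hypothetical helper: classify a systemd unit from two property values.
#   $1 = LoadState   (e.g. "loaded", "not-found")
#   $2 = ActiveState (e.g. "active", "activating", "failed", "inactive")
# Assumption: only LoadState=not-found means the service is not installed.
classify_unit() {
  load_state="$1"
  active_state="$2"
  if [ "$load_state" = "not-found" ]; then
    echo "not installed"
  elif [ "$active_state" = "active" ]; then
    echo "installed, running"
  else
    # A crash-looping service lands here: the unit file exists,
    # but the process is not (or not yet) up.
    echo "installed, not running ($active_state)"
  fi
}

# Usage on a machine with systemd (left commented out so the sketch
# can be sourced anywhere):
#   classify_unit \
#     "$(systemctl show -p LoadState --value elastic-agent.service)" \
#     "$(systemctl show -p ActiveState --value elastic-agent.service)"
```

With the crashing Agent from the steps above, the expectation is that the unit is still loaded while its active state is something other than `active`, which is the case the Upgrade Watcher log reports as an uninstall.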
    

Attachments:

Labels: Team:Elastic-Agent, bug
