Metrics are still received after Grafana Agent stops #643

Open
acsgn opened this issue Oct 18, 2024 · 0 comments
acsgn commented Oct 18, 2024

Bug Description

Hi,
While testing for a customer, I noticed that metrics from machine deployments continue to be received after the Grafana Agent has already stopped, or after the machine itself is down. The metrics only disappear about 4 minutes later, which in our use case makes alerts on them much less useful.
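This tail of "ghost" samples is consistent with Prometheus's instant-query lookback window (`--query.lookback-delta`, default 5m): when the agent is stopped abruptly, no staleness markers are remote-written, so instant queries keep returning the last received sample until it ages out of the window. A minimal Python sketch of that lookup rule follows — the function and timings are illustrative, not Prometheus code:

```python
def instant_query_value(samples, query_time, lookback=300.0):
    """Return the value an instant query would see at query_time.

    samples: list of (timestamp, value) pairs, sorted by timestamp.
    Mimics the lookback rule: the most recent sample within the
    window is returned; without a staleness marker, a dead series
    keeps "existing" until its last sample falls out of the window.
    """
    for ts, value in reversed(samples):
        if ts <= query_time and query_time - ts <= lookback:
            return value
    return None  # series finally looks stale


# Agent stopped right after the sample at t=1000.
samples = [(940.0, 1.0), (1000.0, 1.0)]
print(instant_query_value(samples, 1200.0))  # -> 1.0 (still "received")
print(instant_query_value(samples, 1400.0))  # -> None (aged out)
```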

To Reproduce

multipass launch --cpus 4 --memory 8G --disk 30G --name cos-test 22.04
multipass shell cos-test

HOST_IP=$(hostname -I | cut -d ' ' -f 1)

lxd init --auto
lxc network set lxdbr0 ipv6.address none

sudo snap install microk8s --channel 1.30-strict
sudo microk8s enable hostpath-storage
sudo microk8s enable metallb:$HOST_IP-$HOST_IP

sudo snap install juju
mkdir -p ~/.local/share
juju bootstrap localhost overlord
sudo microk8s config | juju add-k8s k8s --controller overlord

juju add-model zookeeper localhost
juju add-model cos k8s

juju deploy -m zookeeper zookeeper
juju deploy -m zookeeper grafana-agent
juju relate -m zookeeper zookeeper grafana-agent

juju deploy -m cos cos-lite --trust
juju offer cos.grafana:grafana-dashboard
juju offer cos.loki:logging
juju offer cos.prometheus:receive-remote-write

juju consume -m zookeeper cos.prometheus
juju consume -m zookeeper cos.loki
juju consume -m zookeeper cos.grafana
juju relate -m zookeeper grafana-agent grafana
juju relate -m zookeeper grafana-agent loki
juju relate -m zookeeper grafana-agent prometheus

juju run -m cos grafana/leader get-admin-password
## Collect metrics for a while and browse Grafana before proceeding

juju ssh -m zookeeper grafana-agent/leader
date && sudo snap stop grafana-agent
exit

## Go back to Grafana and observe that the metrics are still "received" for 4 minutes after the service stops
## You can also try stopping the LXC container; both cause the same ghost metrics
## You can use the Explore tab and query the following metrics with Prometheus as the data source:
## zookeeper_QuorumSize OR up{juju_application="zookeeper"}

exit
multipass stop cos-test
multipass delete --purge cos-test
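If the goal is timely alerting on agent/machine liveness, one possible workaround is to alert on the age of the last received sample rather than on the series value itself, using the standard PromQL `timestamp()` function. A sketch — the 90s threshold is an illustrative assumption, to be tuned to the scrape interval:

```promql
# Illustrative alert expression, not shipped by any charm: fires when the
# newest sample for a zookeeper series is older than 90 seconds, even though
# the series is still returned within Prometheus's lookback window.
(time() - timestamp(up{juju_application="zookeeper"})) > 90
```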

Environment

The reproduction steps use Multipass; I encountered the same behavior on a local machine, on GCP, and on AWS. All snaps and charms use latest/stable.

Relevant log output

No logs are available

Additional context

No response
