Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Hourly presence change events received on master #64505

Open
2 of 9 tasks
gbunt opened this issue Jun 19, 2023 · 7 comments
Open
2 of 9 tasks

[BUG] Hourly presence change events received on master #64505

gbunt opened this issue Jun 19, 2023 · 7 comments
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt needs-triage Regression The issue is a bug that breaks functionality known to work in previous releases.

Comments

@gbunt
Copy link

gbunt commented Jun 19, 2023

Description
We're currently running into the following with v3006.1: Regardless of any actual changes, the master receives presence/change every hour for all minions. This is not the behavior we had in v3002, and presence change events in our environment can trigger loads of orchestrate states to run, and all at once now.

Setup
(Please provide relevant configs and/or SLS files (be sure to remove sensitive info. There is no general set-up of Salt.)

Please be as specific as possible and give set-up details.

  • on-prem machine
  • VM (Virtualbox, KVM, etc. please specify)
  • VM running on a cloud service, please be explicit and add details
  • container (Kubernetes, Docker, containerd, etc. please specify) LXD
  • or a combination, please be explicit
  • jails if it is FreeBSD
  • classic packaging
  • onedir packaging
  • used bootstrap to install

Steps to Reproduce the behavior
debug logging:

2023-06-18 04:04:43,670 [salt.utils.event :823 ][DEBUG   ][22091] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T02:04:43.670825'}
2023-06-18 05:05:16,638 [salt.utils.event :823 ][DEBUG   ][24066] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T03:05:16.638313'}
2023-06-18 06:05:49,553 [salt.utils.event :823 ][DEBUG   ][25224] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T04:05:49.553452'}
2023-06-18 07:06:22,526 [salt.utils.event :823 ][DEBUG   ][26719] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T05:06:22.526803'}
2023-06-18 08:06:55,540 [salt.utils.event :823 ][DEBUG   ][28012] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T06:06:55.540804'}
2023-06-18 09:07:28,490 [salt.utils.event :823 ][DEBUG   ][29283] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T07:07:28.490620'}
2023-06-18 10:08:01,368 [salt.utils.event :823 ][DEBUG   ][30765] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T08:08:01.368235'}
2023-06-18 11:08:34,233 [salt.utils.event :823 ][DEBUG   ][31988] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T09:08:34.233706'}
2023-06-18 12:09:07,234 [salt.utils.event :823 ][DEBUG   ][33340] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T10:09:07.234497'}
2023-06-18 13:09:40,190 [salt.utils.event :823 ][DEBUG   ][34530] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T11:09:40.190707'}
2023-06-18 14:10:13,058 [salt.utils.event :823 ][DEBUG   ][36180] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T12:10:13.058599'}
2023-06-18 15:10:46,088 [salt.utils.event :823 ][DEBUG   ][37798] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T13:10:46.088279'}
2023-06-18 16:11:19,078 [salt.utils.event :823 ][DEBUG   ][39355] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T14:11:19.078218'}
2023-06-18 17:11:52,016 [salt.utils.event :823 ][DEBUG   ][40622] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T15:11:52.016171'}
2023-06-18 18:12:24,950 [salt.utils.event :823 ][DEBUG   ][42013] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T16:12:24.950058'}
2023-06-18 19:12:57,931 [salt.utils.event :823 ][DEBUG   ][43556] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T17:12:57.931541'}
2023-06-18 20:13:30,817 [salt.utils.event :823 ][DEBUG   ][44718] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T18:13:30.816871'}
2023-06-18 21:14:03,799 [salt.utils.event :823 ][DEBUG   ][46482] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T19:14:03.799817'}
2023-06-18 22:14:46,898 [salt.utils.event :823 ][DEBUG   ][47974] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T20:14:46.898101'}
2023-06-18 23:15:19,799 [salt.utils.event :823 ][DEBUG   ][49405] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T21:15:19.799625'}
2023-06-19 00:15:52,709 [salt.utils.event :823 ][DEBUG   ][50522] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T22:15:52.709098'}
2023-06-19 01:16:25,736 [salt.utils.event :823 ][DEBUG   ][52154] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T23:16:25.736522'}
2023-06-19 02:16:58,649 [salt.utils.event :823 ][DEBUG   ][53452] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T00:16:58.649243'}
2023-06-19 03:17:31,533 [salt.utils.event :823 ][DEBUG   ][54792] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T01:17:31.533062'}
2023-06-19 04:18:04,528 [salt.utils.event :823 ][DEBUG   ][56938] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T02:18:04.528774'}
2023-06-19 05:18:37,484 [salt.utils.event :823 ][DEBUG   ][58051] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T03:18:37.484359'}
2023-06-19 06:19:10,401 [salt.utils.event :823 ][DEBUG   ][59426] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T04:19:10.401127'}
2023-06-19 07:19:43,267 [salt.utils.event :823 ][DEBUG   ][60559] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T05:19:43.267281'}
2023-06-19 08:20:16,253 [salt.utils.event :823 ][DEBUG   ][61818] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T06:20:16.253843'}
2023-06-19 09:20:49,215 [salt.utils.event :823 ][DEBUG   ][14118] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T07:20:49.215430'}
2023-06-19 10:21:32,676 [salt.utils.event :823 ][DEBUG   ][22258] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T08:21:32.676621'}
2023-06-19 11:22:05,217 [salt.utils.event :823 ][DEBUG   ][25328] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T09:22:05.217491'}
2023-06-19 12:22:38,102 [salt.utils.event :823 ][DEBUG   ][31828] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T10:22:38.102539'}
2023-06-19 13:23:11,098 [salt.utils.event :823 ][DEBUG   ][37177] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T11:23:11.097996'}
2023-06-19 14:23:44,048 [salt.utils.event :823 ][DEBUG   ][38463] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T12:23:44.048204'}

Relevant master config:

presence_events: True

tcp_keepalive: True
tcp_keepalive_idle: 10
tcp_keepalive_cnt: 3
tcp_keepalive_intvl: 10

Expected behavior
presence/change events are only fired when minions are actually disconnected / gone or initiate a new TCP connection

Screenshots
If applicable, add screenshots to help explain your problem.

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)
Salt Version:
          Salt: 3006.1
 
Python Version:
        Python: 3.10.11 (main, May  5 2023, 02:31:54) [GCC 11.2.0]
 
Dependency Versions:
          cffi: 1.14.6
      cherrypy: unknown
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.9.8
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 5.4.1
         PyZMQ: 23.2.0
        relenv: 0.12.3
         smmap: Not Installed
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4
 
System Versions:
          dist: centos 7.6.1810 Core
        locale: utf-8
       machine: x86_64
       release: 5.15.0-56-generic
        system: Linux
       version: CentOS Linux 7.6.1810 Core

Additional context
Add any other context about the problem here.

@gbunt gbunt added Bug broken, incorrect, or confusing behavior needs-triage labels Jun 19, 2023
@OrangeDog OrangeDog added Regression The issue is a bug that breaks functionality known to work in previous releases. Core relates to code central or existential to Salt labels Jun 20, 2023
@OrangeDog
Copy link
Contributor

Possibly caused by #63941

@raddessi
Copy link
Contributor

Possibly caused by #63941

Agreed! Possible fix at #64508

@stephanwiechert
Copy link

stephanwiechert commented Mar 12, 2024

We have the same issue here:
Salt Minion Versions:
Salt Version:
Salt: 3006.3

Python Version:
Python: 3.10.13 (main, Sep 6 2023, 02:11:27) [GCC 11.2.0]

Dependency Versions:
cffi: 1.14.6
cherrypy: 18.6.1
dateutil: 2.8.1
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.2
libgit2: Not Installed
looseversion: 1.0.2
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.2
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 22.0
pycparser: 2.21
pycrypto: Not Installed
pycryptodome: 3.9.8
pygit2: Not Installed
python-gnupg: 0.4.8
PyYAML: 6.0.1
PyZMQ: 23.2.0
relenv: 0.13.10
smmap: Not Installed
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.4

System Versions:
dist: ubuntu 22.04.4 jammy
locale: utf-8
machine: x86_64
release: 5.15.0-97-generic
system: Linux
version: Ubuntu 22.04.4 jammy

Salt Master Version:

Salt Version:
Salt: 3006.7

Python Version:
Python: 3.10.13 (main, Feb 19 2024, 03:31:20) [GCC 11.2.0]

Dependency Versions:
cffi: 1.14.6
cherrypy: unknown
dateutil: 2.8.1
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.3
libgit2: Not Installed
looseversion: 1.0.2
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.2
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 22.0
pycparser: 2.21
pycrypto: Not Installed
pycryptodome: 3.19.1
pygit2: Not Installed
python-gnupg: 0.4.8
PyYAML: 6.0.1
PyZMQ: 23.2.0
relenv: 0.15.1
smmap: Not Installed
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.4

System Versions:
dist: ubuntu 22.04.4 jammy
locale: utf-8
machine: x86_64
release: 5.15.0-97-generic
system: Linux
version: Ubuntu 22.04.4 jammy

About every hour all minions marked as "new" within event "/presence/changed" although there was no "lost" event before.

We are using a python script as systemd service to alert us in case of presence change:

import logging
import salt.config
import salt.utils.event
from create_opsgenie_alert import create_opsgenie_alerts

logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%d.%m.%Y %H:%M:%S ', level=logging.INFO)

opts = salt.config.client_config("/etc/salt/master/")

event = salt.utils.event.get_event("master", sock_dir=opts["sock_dir"], opts=opts)

for data in event.iter_events(tag="salt/presence/change"):
logging.info("TAG = salt/presence/change")
logging.info("Data:\n" + data + "\n")
logging.info(data.get('new', '--'))
logging.info(data.get('lost', '--'))
# print(data.get('_stamp'))
alertmsg = "Salt Minion Presence State changed..."
message = "Minion presence changed:\n new: {},\n lost: {}".format(data.get('new'), data.get('lost'))
logging.info(message)
# fire opsgenie alert
create_opsgenie_alerts(alertmsg,'Message:\n {}'.format(message), "Salt Minion State Change Alert","P3")

@raddessi
Copy link
Contributor

raddessi commented Apr 4, 2024

Apologies, I need to update the tests for the fix in #64508 still then it should be ready to merge. My evenings will be busy for a while so I can't say when I will have time to do this

@stephanwiechert
Copy link

Any news here? We handling every day 24 alerts because of this issue because we are checking the salt minion status with that event.
It would be nice to have a workaround until we have a fix, if there is one.

@raddessi
Copy link
Contributor

raddessi commented Jun 5, 2024

Sorry, I have an infant taking up all my spare time :) There is only one test case needed to be added if I recall. I can write down what it is if you want to take a stab at it. It's possible it will require a logic change but I don't recall honestly, it's been to long

@raddessi
Copy link
Contributor

raddessi commented Jun 5, 2024

You can always patch your master with the change from #64508, that is what I have done locally FWIW

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt needs-triage Regression The issue is a bug that breaks functionality known to work in previous releases.
Projects
None yet
Development

No branches or pull requests

4 participants