KNX Listener stops receiving messages #14695
Comments
Hi, are you using any processors? This is why we ask for your full config ;) I am asking because the listener uses tracking metrics and will only consume messages up to the limit set by the max number of messages config option. If a processor does not handle tracking metrics correctly, the listener stops consuming once that limit is reached, which results in this apparent hang.
Also, what version of Telegraf?
Version: docker latest (see my docker compose).
No processors. Full config: https://gist.github.com/alexander-zimmermann/e663d8810416555374dc05a63074d8d1
In that case, I am going to have to build you a debug version to understand what is going on. If I put up a PR with build artifacts, would you be willing to try it?
Sure.
@powersj do you have something I can test?
@alexander-zimmermann thanks for the ping, I completely forgot about this. You can find debug build artifacts at #14906 in 20-30 minutes. Essentially, I have added some additional logging around message handling to see if we were silently dropping something. Because this is a listener, you may need to capture a network trace to see whether any messages are actually coming in, should this debug build not provide any additional information. I would also suggest running this outside of Docker, if possible, to rule that out as a cause.
@powersj I have been running your build artifact for about 24 hours now. No problems at all. Any idea how we can debug the problem further?
Hmm, that is interesting, as all that PR does is add some additional logging. Would you be willing to share the logs? You don't have to post them here; you can email me at jpowers at influxdata.com. Are you seeing any "ignoring message" logs?
I shared the logs with you. Cron currently restarts my Telegraf Docker container every hour. Let me disable that over the weekend and see if the problem still exists with Telegraf 1.29.5. My current guess is that knx-go loses the connection and does not reestablish it. See also: vapourismo/knx-go#70. But if that is the case, why don't I see the problem without Docker (plain Telegraf)?
Thanks for the logs. Unlike in the original logs, it does seem to continue to get messages. Thanks also for the knx-go thread. I found this message interesting. It sounded like there was an opportunity to improve the retry connection logic in knx-go and/or for Telegraf to somehow detect that the Inbound has closed and attempt to reopen it. Let me know how the tests go over the weekend.
It has been stable since Friday. I am continuously receiving messages.
@alexander-zimmermann and @powersj I think I know what happened here... The plugin connects to the tunnel in
@alexander-zimmermann please try the binary in #14959, available once CI has finished the tests. You can trigger the reconnect by unplugging and replugging the network cable, or by otherwise interrupting the network connection between Telegraf and the KNX interface. Let me know if this fixes your issue!
Relevant telegraf.conf
Logs from Telegraf
System info
Host Windows 11 Pro Hyper-V, Ubuntu 22.04 VM, Docker 25.0.2
Docker
telegraf:
  image: telegraf
  container_name: telegraf
  hostname: telegraf
  restart: unless-stopped
  volumes:
    - ./telegraf/config/telegraf.conf:/etc/telegraf/telegraf.conf:ro
    - ./telegraf/config/inputs.knx.conf:/etc/telegraf/telegraf.d/inputs.knx.conf:ro
    - /etc/ssl/certs/ca-certificates.crt:/etc/ssl/ca-certificates.crt:ro
  secrets:
    - telegraf_influxdb_token
  command:
    - --config=/etc/telegraf/telegraf.conf
    - --config-directory=/etc/telegraf/telegraf.d/
  depends_on:
    - influxdb
Steps to reproduce
...
Expected behavior
See incoming messages...
Actual behavior
The KNX bus is up and running. I also run xknx on the Ubuntu VM in parallel, and it shows no issues.
Additional info
No response