Telegraf MQTT input exits if Broker is not available on startup #3167
Comments
Hi @asciijungle, to connect, the paho.mqtt.golang package uses the net.DialTimeout method. It looks like the timeout is set to 30s by default (
IMHO the best thing we can do is expose this variable as a setting in the plugin configuration section. @danielnelson, please let me know what you think and I will prepare an MR. Thanks
I agree we probably should expose a timeout option. Having a long timeout may not always be sufficient here though, since other errors are possible on the socket, such as connection refused if the broker is not listening yet. So I also think we should remove the requirement that the initial connection be available when Telegraf starts. I think this is in line with expectations, since we don't have the same requirement for non-service inputs.
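To illustrate the point about connection refused, here is a minimal Go sketch using only the standard library. The helper names `dialBroker` and `closedLocalAddr` are hypothetical and are not part of Telegraf or the MQTT client library. It shows that a long dial timeout does not help when nothing is listening on the port, because the dial fails right away instead of waiting.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialBroker is a hypothetical helper: it opens a TCP connection to addr,
// giving up after timeout, and closes the connection again on success.
func dialBroker(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("dial %s: %w", addr, err)
	}
	return conn.Close()
}

// closedLocalAddr returns a loopback address that nothing is listening on,
// by binding an ephemeral port and releasing it again.
func closedLocalAddr() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	addr := ln.Addr().String()
	return addr, ln.Close()
}

func main() {
	addr, err := closedLocalAddr()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	// Even with a generous timeout, a closed port is refused right away;
	// the timeout only bounds how long an unanswered dial may take.
	start := time.Now()
	err = dialBroker(addr, 30*time.Second)
	fmt.Printf("failed after %v: %v\n", time.Since(start).Round(time.Millisecond), err)
}
```

In a docker-compose setup the broker container may not be listening yet when Telegraf starts, which is the situation this sketch imitates on the loopback interface.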
Hi @danielnelson, could you please give me more details on how to remove the requirement for the initial connection? As I understand it (correct me if I'm wrong), we should provide a "retry connection loop" with a predefined counter and a backoff mechanism, to avoid situations where Telegraf quits because the endpoint is not ready. Do you have something like this already implemented in another plugin? Thanks!
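For reference, a retry loop with a predefined counter and backoff, as described above, could look roughly like the following Go sketch. It is an illustration only: `retryWithBackoff`, `errNotReady`, and the doubling delay are assumptions made for this example, not code from Telegraf or the MQTT client library.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error used to simulate an unavailable broker.
var errNotReady = errors.New("broker not ready")

// retryWithBackoff calls connect up to maxAttempts times, sleeping between
// attempts and doubling the delay each time, starting from initialDelay.
// It returns nil on the first success, or the last error wrapped in a
// summary if every attempt fails.
func retryWithBackoff(maxAttempts int, initialDelay time.Duration, connect func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1 // always try at least once
	}
	var err error
	delay := initialDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = connect(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated broker that becomes available on the third attempt.
	connect := func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	}
	err := retryWithBackoff(5, 10*time.Millisecond, connect)
	fmt.Println(calls, err) // prints "3 <nil>"
}
```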
I think in this case retry is provided by the client library, though I don't know the details of how it works. I am basing this on this bit of code:
So hopefully we only need to remove the part where we wait for the Connect function to complete, and the client code will do the backoff and retry loop for us, though this will need to be verified.
There is only one function (
So the library handles reconnects if the initial Perhaps if we introduce a connected boolean to the plugin we can attempt to Connect if we haven't connected yet in the
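The connected-boolean idea can be sketched in Go as follows. This is a hedged illustration: the `consumer` type, its `dial` field, and `errBrokerDown` are hypothetical stand-ins, and retrying from a `Gather` method on each collection interval is an assumption made for this example, not something the comment above confirms.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBrokerDown is a hypothetical error simulating an unreachable broker.
var errBrokerDown = errors.New("broker unreachable")

// consumer is a hypothetical stand-in for the plugin. dial represents the
// client library's connect call.
type consumer struct {
	mu        sync.Mutex
	connected bool
	dial      func() error
}

// connect dials only if no connection has been established yet.
func (c *consumer) connect() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.connected {
		return nil
	}
	if err := c.dial(); err != nil {
		return err
	}
	c.connected = true
	return nil
}

// Start tries to connect once, but an unreachable broker is not fatal.
func (c *consumer) Start() error {
	_ = c.connect() // a failure here is retried from Gather
	return nil
}

// Gather (assumed to run on every collection interval) retries the
// connection until it succeeds, then becomes a no-op.
func (c *consumer) Gather() error {
	return c.connect()
}

func main() {
	attempts := 0
	c := &consumer{dial: func() error {
		attempts++
		if attempts < 3 {
			return errBrokerDown
		}
		return nil
	}}
	fmt.Println(c.Start())  // <nil>: startup does not fail
	fmt.Println(c.Gather()) // broker unreachable
	fmt.Println(c.Gather()) // <nil>: connected on the third dial
}
```

Whether the library's own reconnect logic takes over after the first successful connection would still need to be verified, as noted earlier in the thread.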
Add a connection_timeout option that corresponds to the same option in the MQTT library. Add a connect() function to provide reconnect functionality and to remove the requirement that the initial connection be available when Telegraf starts.
As far as I know, that is how it works. I'm working on it in #3202 to accommodate what you have mentioned. Could you please take a look?
This reverts commit d37028a.
Thanks @DanKans, I'm going to add this to the 1.4.1 release.
This worked for me: https://it-obey.com/index.php/connecting-telegraf-to-mosquitto-with-influxdb/ The MQTT broker needs to allow incoming connections (mosquitto.conf).
Bug report
Relevant telegraf.conf:
System info:
Latest Docker image pulled from Docker Hub.
Steps to reproduce:
docker run -v "$(pwd)/telegraf.conf:/etc/telegraf/telegraf.conf:ro" telegraf:latest
Expected behavior:
Telegraf keeps trying to reconnect until it reaches a configurable timeout.
Actual behavior:
Telegraf exits after a couple of seconds. This is not enough time for an MQTT broker to start up in a docker-compose scenario. The timeout cannot be extended through the configuration defined in mqtt_consumer.go.
I believe only one connection attempt is being made. I'd like to be able to launch my whole stack, consisting of InfluxDB, Telegraf, the MQTT broker and Grafana, in a single docker-compose stack. As of now this is unfortunately not possible.
Additional info: