When a RR does not exist anymore telegraf inputs.dns_query produces errors in its logfile. #14941
Comments
Hi,
That version is many, many years old. Can you please update the version you are using? Based on my local build of master I get the following results:
dns_query,domain=qwe.example.com,host=ryzen,rcode=NXDOMAIN,record_type=A,result=error,server=1.1.1.1 query_time_ms=105.99883,result_code=2i,rcode_value=3i 1709674110000000000
2024-03-05T21:28:29Z D! [outputs.file] Wrote batch of 1 metrics in 21.2µs
2024-03-05T21:28:29Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
Hi @powersj, thanks for getting back. I checked with the same config and the latest version of Telegraf and got the same results. Please have a look at the following:
root@node_exp:~# cat /tmp/aaa.log
$ telegraf --version
To remind myself of this issue, I re-ran this and now see:
It appears that the plugin will produce an error any time there is a non-success response code. I'm still not clear this is actually an issue, though, as we are taking the opportunity to tell the user that something is up. In this case, you are trying to look up something that doesn't exist, so it probably makes sense to error, no?
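To make the behavior described above concrete, here is a minimal Go sketch. It is not Telegraf's actual code: `checkAnswer` and `rcodeNames` are hypothetical names, and the only parts taken from this issue are the error message format and the NXDOMAIN value of 3 reported in `rcode_value`. The idea is simply that every response code other than NOERROR is turned into an error.

```go
package main

import "fmt"

// rcodeNames lists a few DNS response codes defined in RFC 1035.
// It is deliberately not exhaustive.
var rcodeNames = map[int]string{
	0: "NOERROR",
	1: "FORMERR",
	2: "SERVFAIL",
	3: "NXDOMAIN",
	5: "REFUSED",
}

// checkAnswer is a hypothetical helper (an assumption, not the plugin's API).
// It returns an error for every response code other than NOERROR, using the
// same message format as the log line quoted in this issue.
func checkAnswer(rcode int, server, recordType, domain string) error {
	if rcode == 0 {
		return nil
	}
	name, ok := rcodeNames[rcode]
	if !ok {
		name = fmt.Sprintf("RCODE%d", rcode)
	}
	return fmt.Errorf("Invalid answer (%s) from %s after %s query for %s",
		name, server, recordType, domain)
}

func main() {
	// rcode 3 corresponds to the rcode_value=3i field in the metric above.
	if err := checkAnswer(3, "1.1.1.1", "A", "qwe.example.com"); err != nil {
		fmt.Println("E! [inputs.dns_query] Error in plugin:", err)
	}
}
```

Under this model the log line appears on every collection interval for as long as the record is missing, which is the noise discussed in the rest of the thread.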
@powersj maybe we should log the error only once?
I am still not clear on what the use case of this issue is. Does @maintain3r know the endpoint does not exist and not want errors in the logs? Or does this domain come and go? Why would you not want the error in the first place? If anything, my initial thought was to potentially allow filtering out certain error codes from the check. So if the user did not want to see
Why? From my perspective it is just another line in the log that you can ignore if you don't need to worry about it. Other users who typo a domain name may really need to see that log message so they can go in and fix the hostname.
Are your alerts based on logs or metrics?
Unless I'm missing something, I see no value in periodically writing an error string to the log file that repeats what is already visible in the metric. In the example I provided I used only the Cloudflare DNS server and a single name to check. If I want to test the same record against 3-4 different external DNS servers, I get even more noise. On top of that, I can have more RRs, which over time or by mistake may get deleted by DNS zone admins, making things noisier still. Hope it makes sense :)
I can understand, except this noise to you is a legitimate call to action for another user. Even if we only logged the error once, my concern is that this starts to hide legit issues or make them more difficult for others to spot, because they see the message once in the logs and then go "well it must have gone away and isn't an issue anymore". I am not convinced we should remove the messages, nor should we log only once. I would consider a filter, because that is opt-in.
@powersj That's fair! And maybe having a knob that turns the logging on and off (at the plugin level only) would make things better?
Allows the user to specify error types that should not be printed in the logs. Fixes: influxdata#14941
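As an illustration of the opt-in filter idea, here is a minimal Go sketch. It makes no claim about how the linked change is implemented: `shouldLog` and the `ignored` list are hypothetical names, and the actual option name in the plugin is not stated in this thread. The sketch only decides whether to log; the metric would still carry `rcode` and `result` either way.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldLog reports whether an error with the given rcode name should be
// written to the log. ignored is a hypothetical, user-supplied list of rcode
// names to suppress; an empty list keeps the current behavior (log all).
func shouldLog(rcode string, ignored []string) bool {
	for _, name := range ignored {
		if strings.EqualFold(name, rcode) {
			return false
		}
	}
	return true
}

func main() {
	// Assumed example setting: the user opts in to silencing NXDOMAIN only.
	ignored := []string{"NXDOMAIN"}
	for _, rcode := range []string{"NXDOMAIN", "SERVFAIL"} {
		fmt.Printf("%s: log=%v\n", rcode, shouldLog(rcode, ignored))
	}
}
```

Because the list is empty by default, users who rely on the log message to catch a typo in a domain name see no change, which addresses the concern raised earlier in the thread.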
Relevant telegraf.conf
Logs from Telegraf
System info
Telegraf 1.12.6
Docker
No response
Steps to reproduce
Get Telegraf 1.12.6 and use the config provided in this ticket.
Run Telegraf with that config. A Telegraf Docker image works too.
Check the logs coming out of Telegraf.
Expected behavior
Expected behaviour is just to reflect the missing record in the metric dns_query_rcode_value with rcode="NXDOMAIN" and result="error".
Maybe there's a way to prevent the plugin from logging?
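Since the expected behavior relies on the metric rather than the log, here is a rough Go sketch of reading `rcode` and `result` from one line-protocol record, such as the one quoted earlier in this issue. `parseTags` is a hypothetical helper and a simplification: it assumes no escaped commas, spaces, or equals signs, so it is not a full line-protocol parser.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags returns the measurement name and tag set of one line-protocol
// record. Simplified on purpose: escaping rules are not handled.
func parseTags(line string) (string, map[string]string) {
	// The first space separates "measurement,tags" from the field set.
	head, _, _ := strings.Cut(line, " ")
	parts := strings.Split(head, ",")
	tags := make(map[string]string, len(parts)-1)
	for _, kv := range parts[1:] {
		if k, v, ok := strings.Cut(kv, "="); ok {
			tags[k] = v
		}
	}
	return parts[0], tags
}

func main() {
	line := "dns_query,domain=qwe.example.com,host=ryzen,rcode=NXDOMAIN," +
		"record_type=A,result=error,server=1.1.1.1 " +
		"query_time_ms=105.99883,result_code=2i,rcode_value=3i 1709674110000000000"

	name, tags := parseTags(line)
	if tags["result"] == "error" {
		fmt.Printf("%s: %s for %s via %s\n",
			name, tags["rcode"], tags["domain"], tags["server"])
	}
}
```

An alert built on these tags carries the same information as the log line, which is the point made above about the log message duplicating the metric.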
Actual behavior
2024-03-05T21:09:20Z E! [inputs.dns_query] Error in plugin: Invalid answer (NXDOMAIN) from 1.1.1.1 after A query for qwe.example.com
Additional info
Many thanks!