Cloudwatch input changes for 1.20 are breaking #10027
Comments
@cthiel42 Was this also an issue with 1.20.1 and 1.20.2? Are you doing anything else with the dockerfile? Note that in 1.20.3 we stopped running telegraf as the root user.
@powersj Given that 1.20.2 was released on October 7th and I did my last deployment on October 12th with no problems, it appears that the issue came about in 1.20.3. And I'm not doing anything with a dockerfile here, I'm just pulling and using your latest image. I also wouldn't think this would be a user issue, as the plugin is able to communicate with AWS; it returns the available time series, so it should be getting the correct values from AWS. But I also don't know what else goes on behind the scenes with this plugin.
@cthiel42 ok can you run this in debug mode (e.g. |
@powersj Not much substance to the debug logs. I did confirm that I was getting the same issue while this was running.
And here's the exact config I was using for this test
I am seeing this issue as well. The values don't appear to be zeroed out; rather, every value for a given metric is the same (which of course can be zero). I reverted to 1.20.2 and the issue is not present on that version, so it seems to be related to #9647.
Hi, I think I have been able to reproduce this locally. I have a PR up that adds some debugging code and updates the dependencies for AWS-related packages a bit further. With that build I started to get metrics. Could someone please try the artifacts linked in the comment on PR #10051? Thanks!
I have tested the .deb artifact ( |
Can you confirm you were running with |
* origin/master: (133 commits)
  chore: restart service if it is already running and upgraded via RPM (influxdata#9970)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#10237)
  fix: Handle duplicate registration of protocol-buffer files gracefully. (influxdata#10188)
  fix(http_listener_v2): fix panic on close (influxdata#10132)
  feat: add Vault input plugin (influxdata#10198)
  feat: support aws managed service for prometheus (influxdata#10202)
  fix: Make telegraf compile on Windows with golang 1.16.2 (influxdata#10246)
  Update changelog
  feat: Modbus add per-request tags (influxdata#10231)
  fix: Implement NaN and inf handling for elasticsearch output (influxdata#10196)
  feat: add nomad input plugin (influxdata#10106)
  fix: Print loaded plugins and deprecations for once and test (influxdata#10205)
  fix: eliminate MIB dependency for ifname processor (influxdata#10214)
  feat: Optimize locking for SNMP MIBs loading. (influxdata#10206)
  feat: Add SMART plugin concurrency configuration option, nvme-cli v1.14+ support and lint fixes. (influxdata#10150)
  feat: update configs (influxdata#10236)
  fix(inputs/kube_inventory): set TLS server name config properly (influxdata#9975)
  fix: Sudden close of Telegraf caused by OPC UA input plugin (influxdata#10230)
  fix: bump github.com/eclipse/paho.mqtt.golang from 1.3.0 to 1.3.5 (influxdata#9913)
  fix: json_v2 parser timestamp setting (influxdata#10221)
  fix: ensure graylog spec fields not prefixed with '_' (influxdata#10209)
  docs: remove duplicate links in CONTRIBUTING.md (influxdata#10218)
  fix: pool detection and metrics gathering for ZFS >= 2.1.x (influxdata#10099)
  fix: parallelism fix for ifname processor (influxdata#10007)
  chore: Forbids "log" package only for aggregators, inputs, outputs, parsers and processors (influxdata#10191)
  docs: address documentation gap when running telegraf in k8s (influxdata#10215)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#10211)
  fix: mqtt topic extracting no longer requires all three fields (influxdata#10208)
  fix: windows service - graceful shutdown of telegraf (influxdata#9616)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#10201)
  feat: Modbus support multiple slaves (gateway feature) (influxdata#9279)
  fix: Revert unintented corruption of the Makefile from influxdata#10200. (influxdata#10203)
  chore: remove triggering update-config bot in CI (influxdata#10195)
  Update changelog
  feat: Implement deprecation infrastructure (influxdata#10200)
  fix: extra lock on init for safety (influxdata#10199)
  fix: resolve influxdata#10027 (influxdata#10112)
  fix: register bigquery to output plugins influxdata#10177 (influxdata#10178)
  fix: sysstat use unique temp file vs hard-coded (influxdata#10165)
  refactor: snmp to use gosmi (influxdata#9518)
  ...
Relevant telegraf.conf
System info
1.20.3 Docker Image
Docker
No response
Steps to reproduce
Expected behavior
The cloudwatch input should query and return all the available metrics and their corresponding time series values based on your conditions.
Actual behavior
All of the time series are returned like normal, except they all contain a value of 0.
Additional info
After doing a docker pull for the latest telegraf image and refreshing the container, all cloudwatch metrics went to 0. I noticed in the change log that there were some changes to the cloudwatch input recently. I refreshed telegraf last on October 12th, so the only changes to the cloudwatch input according to the changelog would be as a result of #9647.
I resolved my issue by specifying version 1.19 in order to avoid all the cloudwatch input changes over the past month or two. I attached a screenshot of a panel from one of our dashboards. Note the behavior is that the metric is still returned, its value is just 0. This happened for all cloudwatch metrics collected by telegraf.
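For anyone trying to confirm the same symptom from exported data, here is a rough sketch of a check that flags series whose samples are all zero, or all the same value as reported in the comments. It is only an illustration: the function name `flag_suspect_series`, the series names, and the sample values are made up for this example and do not come from telegraf or from my actual data.

```python
from typing import Dict, List


def flag_suspect_series(series: Dict[str, List[float]]) -> Dict[str, str]:
    """Label each series whose samples are all zero or all identical."""
    suspects: Dict[str, str] = {}
    for name, values in series.items():
        if len(values) < 2:
            continue  # too few samples to judge either way
        if all(v == 0 for v in values):
            suspects[name] = "all zero"
        elif len(set(values)) == 1:
            suspects[name] = "constant"
    return suspects


# Hypothetical series names and values, for illustration only.
samples = {
    "cpu_utilization_average": [0.0, 0.0, 0.0],
    "request_count_sum": [12.0, 12.0, 12.0],
    "free_storage_average": [41.5, 40.9, 40.2],
}

for name, reason in flag_suspect_series(samples).items():
    print(f"{name}: {reason}")
```

A series that is legitimately flat will also be flagged, so this only narrows down where to look; comparing against the CloudWatch console for the same period is still needed.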