node_textfile_scrape_error not reported when failing to read metrics due to inconsistent HELP texts

### Host operating system:
Linux tau-gfa-uat 5.4.0-62-generic #70-Ubuntu SMP Tue Jan 12 12:45:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

### node_exporter version:
node_exporter, version 1.3.1 (branch: HEAD, revision: a2321e7b940ddcff26873612bccdf7cd4c42b6b6)
  build user:       root@243aafa5525c
  build date:       20211205-11:09:49
  go version:       go1.17.3
  platform:         linux/amd64

### node_exporter command line flags
```
/usr/local/bin/node_exporter \
  --collector.disable-defaults \
  --collector.cpu \
  --collector.cpufreq \
  --collector.diskstats \
  --collector.edac \
  --collector.filefd \
  --collector.filesystem \
  --collector.filesystem.fs-types-exclude \
  '^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|fuse.*|hugetlbfs|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$' \
  --collector.hwmon \
  --collector.loadavg \
  --collector.mdadm \
  --collector.meminfo \
  --collector.netdev \
  --collector.netstat \
  --collector.pressure \
  --collector.processes \
  --collector.schedstat \
  --collector.sockstat \
  --collector.stat \
  --collector.textfile \
  --collector.textfile.directory \
  /var/lib/node_exporter_textfile \
  --collector.time \
  --collector.timex \
  --collector.vmstat \
  --web.disable-exporter-metrics \
  --web.listen-address \
  :9100
```

### Are you running node_exporter in Docker?
No

### What did you do that produced an error?
For technical reasons, different parts of our system write separate textfile collector files for the same metric, each with different labels. Because the code that writes these files diverged, the files ended up with inconsistent HELP texts.

For example:
```
# cat tau_infrastructure_performing_maintenance_task_foo.prom 
# HELP tau_infrastructure_performing_maintenance_task The server is performing some long-running maintenance task
# TYPE tau_infrastructure_performing_maintenance_task gauge
tau_infrastructure_performing_maintenance_task{main_task="deployment", sub_task="deployment_ansible", start_or_stop="start"} 1645624007.0
```
and
```
# cat tau_infrastructure_performing_maintenance_task_bar.prom 
# HELP tau_infrastructure_performing_maintenance_task At what timestamp a given task started or stopped, the last time it was run.
# TYPE tau_infrastructure_performing_maintenance_task gauge
tau_infrastructure_performing_maintenance_task{main_task="nightly",sub_task="main",start_or_stop="start"} 1647280801.98446
```
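As a stopgap on our side, a check along these lines can flag the mismatch before node_exporter reads the files. This is only a minimal sketch: the function names (`collect_help_texts`, `find_help_conflicts`, `read_prom_files`) are made up for illustration, and the parsing assumes each HELP line has the usual `# HELP <name> <text>` shape. It is not how node_exporter itself validates the files.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches "# HELP <metric name> <help text>"; the help text may be empty.
HELP_RE = re.compile(r"^#\s*HELP\s+(\S+)(?:\s+(.*))?$")


def collect_help_texts(files):
    """Map metric name -> {help text -> [source names]} from (name, content) pairs."""
    seen = defaultdict(lambda: defaultdict(list))
    for name, content in files:
        for line in content.splitlines():
            match = HELP_RE.match(line.strip())
            if match:
                metric, help_text = match.groups()
                seen[metric][(help_text or "").strip()].append(name)
    return seen


def find_help_conflicts(files):
    """Return only the metrics that carry more than one distinct HELP text."""
    return {
        metric: dict(texts)
        for metric, texts in collect_help_texts(files).items()
        if len(texts) > 1
    }


def read_prom_files(directory):
    """Yield (file name, content) for every *.prom file in the given directory."""
    for path in sorted(Path(directory).glob("*.prom")):
        yield path.name, path.read_text()
```

Calling `find_help_conflicts(read_prom_files("/var/lib/node_exporter_textfile"))` on the directory from the flags above should return a non-empty dict for the two files shown, keyed by `tau_infrastructure_performing_maintenance_task`.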

### What did you expect to see?

I would expect either:

1. To get all metrics, with node_exporter arbitrarily choosing one of the two HELP texts where they differ, or
2. If some metrics are dropped, for node_textfile_scrape_error to be non-zero for this job.

### What did you see instead?

node_exporter logs an error about the situation and Prometheus does not scrape all of the metrics, yet node_textfile_scrape_error is still zero even though a problem is preventing metrics from being exported.

```
Mar 15 10:43:25 REDACTED node_exporter[733184]: ts=2022-03-15T08:43:25.153Z caller=stdlib.go:105 level=error msg="error gathering metrics: 4 error(s) occurred:\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"start\" > label:<name:\"sub_task\" value:\"main\" > gauge:<value:1.64728080198446e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\"\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"stop\" > label:<name:\"sub_task\" value:\"main\" > gauge:<value:1.64728240041946e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\"\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"start\" > label:<name:\"sub_task\" value:\"reporting\" > gauge:<value:1.64728080229161e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\"\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"stop\" > label:<name:\"sub_task\" value:\"reporting\" > gauge:<value:1.64728239993993e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\""
```
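Since the metric gives no signal, the log line is currently the only place to notice the problem. The sketch below condenses a line like the one above into one entry per distinct conflict. It assumes the `has help "..." but should have "..."` wording seen in this log, with or without backslash-escaped quotes; `summarize_help_conflicts` is a hypothetical helper name, not part of any tool.

```python
import re
from collections import Counter

# Captures: metric name, the HELP text that was found, the HELP text that was expected.
# Quotes may appear backslash-escaped, as they do in the journal output above.
CONFLICT_RE = re.compile(
    r'collected metric (\S+) .*? has help \\?"(.*?)\\?" but should have \\?"(.*?)\\?"'
)


def summarize_help_conflicts(log_line):
    """Return {(metric, found_help, expected_help): number of dropped samples}."""
    return dict(Counter(m.groups() for m in CONFLICT_RE.finditer(log_line)))
```

For the log line above, this should collapse the four reported errors into a single entry for `tau_infrastructure_performing_maintenance_task`.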


node_textfile_scrape_error not reported when failing to read metrics due to inconsistent HELP texts #2317
