An error has occurred during metrics collection #1

Open

itatabitovski opened this issue Sep 8, 2016 · 6 comments

@itatabitovski (Contributor)

I tried running this exporter, but I am getting the following error:

An error has occurred during metrics collection:

4 error(s) occurred:
* collected metric nomad_allocation_cpu label:<name:"alloc" value:"infra/statsd-exporter.statsd-exporter[0]" > label:<name:"group" value:"statsd-exporter" > label:<name:"job" value:"infra/statsd-exporter" > gauge:<value:4.02303193877551 >  was collected before with the same name and label values
* collected metric nomad_allocation_cpu_throttle label:<name:"alloc" value:"infra/statsd-exporter.statsd-exporter[0]" > label:<name:"group" value:"statsd-exporter" > label:<name:"job" value:"infra/statsd-exporter" > gauge:<value:0 >  was collected before with the same name and label values
* collected metric nomad_allocation_memory label:<name:"alloc" value:"infra/statsd-exporter.statsd-exporter[0]" > label:<name:"group" value:"statsd-exporter" > label:<name:"job" value:"infra/statsd-exporter" > gauge:<value:2.2781952e+07 >  was collected before with the same name and label values
* collected metric nomad_allocation_memory_limit label:<name:"alloc" value:"infra/statsd-exporter.statsd-exporter[0]" > label:<name:"group" value:"statsd-exporter" > label:<name:"job" value:"infra/statsd-exporter" > gauge:<value:256 >  was collected before with the same name and label values

I believe this happens when there are several older allocations.

nomad status infra/statsd-exporter
ID          = infra/statsd-exporter
Name        = infra/statsd-exporter
Type        = service
Priority    = 50
Datacenters = ovh
Status      = running
Periodic    = false

Summary
Task Group       Queued  Starting  Running  Failed  Complete  Lost
statsd-exporter  0       0         1        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group       Desired  Status    Created At
57ec626a  60bc583d  375d5aaf  statsd-exporter  run      running   09/08/16 10:13:30 UTC
47ce6dd6  2e863db7  375d5aaf  statsd-exporter  stop     complete  09/08/16 09:28:16 UTC
16dc534e  5913852f  22defaf9  statsd-exporter  stop     complete  09/05/16 11:55:17 UTC

After manually triggering garbage collection, the old allocations were gone and the exporter worked.

curl -X PUT http://localhost:4646/v1/system/gc

nomad status infra/statsd-exporter
ID          = infra/statsd-exporter
Name        = infra/statsd-exporter
Type        = service
Priority    = 50
Datacenters = ovh
Status      = running
Periodic    = false

Summary
Task Group       Queued  Starting  Running  Failed  Complete  Lost
statsd-exporter  0       0         1        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group       Desired  Status   Created At
57ec626a  60bc583d  375d5aaf  statsd-exporter  run      running  09/08/16 10:13:30 UTC



@Nomon (Owner) commented Sep 8, 2016

Might need to add the allocation ID as a label to the allocations, or alternatively only collect from running allocations, to ensure the uniqueness of the name + labels (job_name, group_name, alloc_name[alloc_index]). I will take a closer look later today.
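
A minimal sketch of the running-only approach, assuming the hashicorp/nomad/api client and prometheus/client_golang; the Desc, the alloc_id label, and the lookupStats helper are illustrative stand-ins rather than the exporter's actual code:

package collector

import (
	"github.com/hashicorp/nomad/api"
	"github.com/prometheus/client_golang/prometheus"
)

// Illustrative Desc: alloc_id would keep the series unique even when an old
// allocation and its replacement share the same name.
var allocCPU = prometheus.NewDesc(
	"nomad_allocation_cpu",
	"CPU usage of a Nomad allocation.",
	[]string{"job", "group", "alloc", "alloc_id"},
	nil,
)

// allocStats and lookupStats are hypothetical stand-ins for fetching
// per-allocation resource usage from the Nomad client API.
type allocStats struct{ CPU float64 }

func lookupStats(a *api.AllocationListStub) allocStats { return allocStats{} }

func collectAllocations(client *api.Client, ch chan<- prometheus.Metric) error {
	allocs, _, err := client.Allocations().List(nil)
	if err != nil {
		return err
	}
	for _, a := range allocs {
		// Skip complete/failed/lost allocations that Nomad has not yet
		// garbage-collected; they collide with their replacements.
		if a.ClientStatus != api.AllocClientStatusRunning {
			continue
		}
		stats := lookupStats(a)
		ch <- prometheus.MustNewConstMetric(
			allocCPU, prometheus.GaugeValue, stats.CPU,
			a.JobID, a.TaskGroup, a.Name, a.ID,
		)
	}
	return nil
}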

@itatabitovski (Contributor, Author)

I think only running allocations are of interest, as they are the only ones with interesting metrics.

I don't know if the alloc index is necessary; isn't the alloc name already unique?

@Nomon (Owner) commented Sep 9, 2016

The alloc index is included in the alloc name: if a group has count = 10, the allocs are named task_name[0] through task_name[9]. That is also why the completed allocations in the status output above collide with the running one: they all carry the name statsd-exporter[0].

@itatabitovski (Contributor, Author)

Sorry, my mistake: I meant the allocation ID, not the index.

Would it be of interest to have allocation counts by status? Right now nomad_allocations shows all allocations. For example:

nomad_allocations{status="running|completed"}
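
If counts by status were added, a minimal sketch with prometheus/client_golang could look like this; the metric and label names just mirror the suggestion above and are not settled:

package collector

import (
	"github.com/hashicorp/nomad/api"
	"github.com/prometheus/client_golang/prometheus"
)

var allocationCount = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "nomad_allocations",
		Help: "Number of Nomad allocations by client status.",
	},
	[]string{"status"},
)

func updateAllocationCount(client *api.Client) error {
	allocs, _, err := client.Allocations().List(nil)
	if err != nil {
		return err
	}
	allocationCount.Reset() // clear statuses that no longer have allocations
	for _, a := range allocs {
		// ClientStatus is e.g. "pending", "running", "complete", "failed", "lost".
		allocationCount.WithLabelValues(a.ClientStatus).Inc()
	}
	return nil
}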

@Nomon (Owner) commented Dec 9, 2016

Might be useful; it would allow monitoring queued counts etc. The same could perhaps be extended to nodes, and we could add evaluations by status as well in the future. We should go through the information the built-in stats providers (statsite, statsd, Datadog, etc.) expose and try to emulate those to some extent.
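
Evaluations by status could follow the same pattern; a sketch, assuming the Evaluations().List endpoint of the hashicorp/nomad/api client (the metric name here is hypothetical):

package collector

import (
	"github.com/hashicorp/nomad/api"
	"github.com/prometheus/client_golang/prometheus"
)

// Hypothetical metric name; not part of the exporter yet.
var evalCount = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "nomad_evals",
		Help: "Number of Nomad evaluations by status.",
	},
	[]string{"status"},
)

func updateEvalCount(client *api.Client) error {
	evals, _, err := client.Evaluations().List(nil)
	if err != nil {
		return err
	}
	evalCount.Reset() // clear statuses that no longer occur
	for _, e := range evals {
		// e.Status is e.g. "blocked", "pending", "complete", "failed", "canceled".
		evalCount.WithLabelValues(e.Status).Inc()
	}
	return nil
}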

@pznamensky commented Aug 21, 2017

Hi,
I'm seeing the same error with nomad_serf_lan_member_status:

An error has occurred during metrics collection:

collected metric nomad_serf_lan_member_status label:<name:"class" value:"" > label:<name:"datacenter" value:"staging" > label:<name:"drain" value:"false" > label:<name:"node" value:"<cluster_member_hostname_here>" > gauge:<value:0 >  was collected before with the same name and label values

As you can see below, there are two nodes with the same name.
I guess one of Nomad's agents was lost and a new one was started in its place.

~ $ nomad node-status
ID        DC       Name              Class   Drain  Status
13c89393  staging  app1.test.local   <none>  false  ready
bcb94e93  staging  app1.test.local   <none>  false  down
98f7b583  staging  app3.test.local   <none>  false  ready
869ba8a7  staging  app7.test.local   <none>  false  ready
a5bac338  staging  app9.test.local   <none>  false  ready
f5cc2390  staging  app5.test.local   <none>  false  ready
28ed1f83  staging  app13.test.local  <none>  false  ready
0e71ac4e  staging  app11.test.local  <none>  false  ready

Have you thought about adding a node_id label?
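
A minimal sketch of that, assuming the node list endpoint of the hashicorp/nomad/api client; the Desc and the ready/drain encoding are illustrative choices, not the exporter's actual code:

package collector

import (
	"strconv"

	"github.com/hashicorp/nomad/api"
	"github.com/prometheus/client_golang/prometheus"
)

// Illustrative Desc: node_id would keep the series unique even when a lost
// node and its replacement share a hostname.
var serfLanMemberStatus = prometheus.NewDesc(
	"nomad_serf_lan_member_status",
	"Status of the Nomad serf LAN member (1 = ready).",
	[]string{"datacenter", "class", "node", "node_id", "drain"},
	nil,
)

func collectNodes(client *api.Client, ch chan<- prometheus.Metric) error {
	nodes, _, err := client.Nodes().List(nil)
	if err != nil {
		return err
	}
	for _, n := range nodes {
		value := 0.0
		if n.Status == "ready" {
			value = 1.0
		}
		ch <- prometheus.MustNewConstMetric(
			serfLanMemberStatus, prometheus.GaugeValue, value,
			n.Datacenter, n.NodeClass, n.Name, n.ID,
			strconv.FormatBool(n.Drain),
		)
	}
	return nil
}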
