
k8sObjects events watch fails on 0.75.0 "Watch failed" err="unknown" #759

Closed
matthewmodestino opened this issue Apr 28, 2023 · 3 comments
Labels: bug, Stale

@matthewmodestino

What happened?

Description

Deployed 0.75.0 and enabled the k8sObjects receiver to pull pods and watch events.

```yaml
  k8sObjects:
    - name: pods
      mode: pull
      interval: 5m
    - name: events
      mode: watch
      group: events.k8s.io
```

Steps to Reproduce

Deploy the 0.75.0 cluster receiver and enable the k8sObjects config to watch Kubernetes events.
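
For reference, a minimal reproduction sketch (the release name and values file are placeholders; the chart also needs its required settings such as cluster name and access token, omitted here):

```shell
# Hypothetical install; values.yaml carries the k8sObjects config above.
helm repo add splunk-otel-collector-chart https://signalfx.github.io/splunk-otel-collector-chart
helm install splunk-otel-collector splunk-otel-collector-chart/splunk-otel-collector \
  --version 0.75.0 -f values.yaml
```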

Expected Result

The cluster receiver should start, pull Kubernetes events, and not spam this error continuously:

```shell
E0428 14:19:50.207608       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:51.207902       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:52.208284       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:53.208521       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:54.209286       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:55.209911       1 retrywatcher.go:130] "Watch failed" err="unknown"
```

Actual Result

The cluster receiver runs and pulls objects but does not successfully watch Kubernetes API events.

Chart version

0.75.0

Environment information

Environment

Cloud: AWS EC2
k8s version: MicroK8s v1.26.3 revision 4959

```shell
kubectl get nodes
NAME   STATUS   ROLES    AGE   VERSION
so1    Ready    <none>   37m   v1.26.3
```

OS: Ubuntu 20.04

Chart configuration

```yaml
k8sObjects:
  - name: pods
    mode: pull
    interval: 5m
  - name: events
    mode: watch
    group: events.k8s.io
```


Log output

```shell
2023-04-28T14:32:17.858Z	info	k8sobjectsreceiver@v0.75.0/receiver.go:91	Started collecting	{"kind": "receiver", "name": "k8sobjects", "data_type": "logs", "gvr": "/v1, Resource=pods", "mode": "pull", "namespaces": []}
2023-04-28T14:32:17.858Z	info	k8sobjectsreceiver@v0.75.0/receiver.go:91	Started collecting	{"kind": "receiver", "name": "k8sobjects", "data_type": "logs", "gvr": "events.k8s.io/v1, Resource=events", "mode": "watch", "namespaces": []}


E0428 14:19:50.207608 1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:51.207902 1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:52.208284 1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:53.208521 1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:54.209286 1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:19:55.209911 1 retrywatcher.go:130] "Watch failed" err="unknown"

Additional context

Flipped on debug; it didn't expose any more helpful info.

```shell
2023-04-28T14:32:19.667Z	debug	memorylimiterprocessor@v0.75.0/memorylimiter.go:284	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "metrics/collector", "cur_mem_mib": 38}
E0428 14:32:19.860191       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:32:20.860273       1 retrywatcher.go:130] "Watch failed" err="unknown"
2023-04-28T14:32:21.667Z	debug	memorylimiterprocessor@v0.75.0/memorylimiter.go:284	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "metrics/collector", "cur_mem_mib": 38}
E0428 14:32:21.860816       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:32:22.861132       1 retrywatcher.go:130] "Watch failed" err="unknown"
2023-04-28T14:32:23.667Z	debug	memorylimiterprocessor@v0.75.0/memorylimiter.go:284	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "metrics/collector", "cur_mem_mib": 39}
E0428 14:32:23.861469       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:32:24.861801       1 retrywatcher.go:130] "Watch failed" err="unknown"
2023-04-28T14:32:25.666Z	debug	memorylimiterprocessor@v0.75.0/memorylimiter.go:284	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "metrics/collector", "cur_mem_mib": 39}
E0428 14:32:25.861949       1 retrywatcher.go:130] "Watch failed" err="unknown"
E0428 14:32:26.862336       1 retrywatcher.go:130] "Watch failed" err="unknown"
```
matthewmodestino added the bug label on Apr 28, 2023
@matthewmodestino (Author) commented Apr 28, 2023

Was able to resolve this thanks to @rmfitzpatrick!

https://github.com/signalfx/splunk-otel-collector-chart/blob/be3f7f6c858672d896e009017840b217bc755061/UPGRADING.md#0670-to-0680

Needed to add an rbac.customRules section to the "extra system configuration" section of the chart to ensure the right ClusterRole rules were rendered.

```yaml
rbac:
  customRules:
    - apiGroups:
      - "events.k8s.io"
      resources:
      - events
      verbs:
      - get
      - list
      - watch
```
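
For anyone verifying the fix, a quick check sketch (the service account name and namespace here are placeholders; substitute whatever your release uses):

```shell
# Expect "yes" once the customRules above are rendered into the ClusterRole.
kubectl auth can-i watch events.events.k8s.io \
  --as=system:serviceaccount:default:splunk-otel-collector
```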

Should probably be rendered by default, no?

@omrozowicz-splunk (Contributor)
Hey, I remember we initially tried to autogenerate these rules, but there was a problem with finding out the apiGroup version:
#588 (comment)

TL;DR: the k8sobjects receiver is smart and can figure out the apiGroup version on its own, while the Helm chart's customRules is naive and requires it to be passed explicitly.
We went down this road to avoid overcomplicating the configuration. What I think would help, though, is returning something meaningful instead of "Watch failed" err="unknown".
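
As a side note, a quick way to see which version the API server actually serves for a group (the piece the receiver discovers automatically) is something like:

```shell
# The APIVERSION column shows the served group/version, e.g. events.k8s.io/v1
# (output varies by cluster version).
kubectl api-resources --api-group=events.k8s.io
```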

github-actions bot (Contributor) commented Oct 10, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
