
add k8s object receiver #588

Merged
8 commits merged on Jan 10, 2023

Conversation

@omrozowicz-splunk (Contributor) commented Nov 21, 2022

Replacing k8seventsreceiver with k8sobjectsreceiver.

Since using the k8s objects receiver requires creating new rules in clusterRole.yaml, the PR adds splunk-otel-collector.clusterRole.rules, a template function that aggregates and renders all apiGroups rules with the "get", "list", and "watch" verbs, so there is no duplication.

It requires a splunk-otel-collector release that includes signalfx/splunk-otel-collector#2270.

There are slight indentation differences in the rendered templates because they were not entirely consistent before (sometimes with whitespace after apiGroups, sometimes not). With the change from this PR it is handled by the function:

- apiGroups:
  - {{ $groupName }}
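
For illustration, a minimal sketch of how such an aggregating helper could be defined. The define name matches the PR description, but the group and resource entries below are placeholders, not the actual chart contents:

```yaml
{{/*
Hypothetical sketch only: the real helper in the PR builds the group/resource
list from the chart configuration; the entries below are placeholders.
*/}}
{{- define "splunk-otel-collector.clusterRole.rules" }}
{{- range $groupName := list "" "events.k8s.io" }}
- apiGroups:
  - {{ $groupName | quote }}
  resources:
    - events
  verbs:
    - "get"
    - "list"
    - "watch"
{{- end }}
{{- end }}
```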

Also, the eventsEnabled setting was renamed to objectsEnabled so the name better reflects what we actually pull.

@omrozowicz-splunk changed the title from "add k8s objectr eceiver" to "add k8s object receiver" on Nov 21, 2022
@omrozowicz-splunk marked this pull request as ready for review on November 24, 2022 08:34
@omrozowicz-splunk changed the title from "add k8s object receiver" to "[WIP] add k8s object receiver" on Nov 29, 2022
@rmfitzpatrick (Contributor)

Do you have any guidance and examples on the differences in reported telemetry between the two receivers? Presumably this is going to affect any user systems that depend on the events receiver.

@hvaghani221 (Contributor)

> Do you have any guidance and examples on the differences in reported telemetry between the two receivers? Presumably this is going to affect any user systems that depend on the events receiver.

k8sobjects stores the whole payload in the log body, while k8s_events only stores the event.Message field as the body and only adds the following fields as resource/log attributes:

  • k8s.object.kind
  • k8s.object.name
  • k8s.object.uid
  • k8s.object.fieldpath
  • k8s.object.api_version
  • k8s.object.resource_version
  • k8s.event.reason
  • k8s.event.action
  • k8s.event.start_time
  • k8s.event.name
  • k8s.event.uid
  • k8s.namespace.name

k8s_event payload: [screenshot]

k8sobjects payload: [screenshot]
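
For context, here is a minimal, hedged sketch of how the k8sobjects receiver is typically configured; the object names, modes, and interval are illustrative, not the chart's actual defaults. Each listed object is either pulled periodically or watched as a stream, and the full object payload ends up in the log body as described above:

```yaml
receivers:
  k8sobjects:
    auth_type: serviceAccount
    objects:
      # pulled on an interval; the whole object becomes the log body
      - name: pods
        mode: pull
        interval: 15m
      # watched continuously; each change event is emitted as a log record
      - name: events
        mode: watch
```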

@dmitryax (Contributor)

> k8sobjects stores the whole payload in the log body, while k8s_events only stores the event.Message field as the body and only adds the following fields as resource/log attributes

This needs to be well documented in https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md

@omrozowicz-splunk (Contributor, Author) commented Dec 2, 2022

One more thing: the sourcetype of events generated by k8sobjectsreceiver is now kube:events. Do we want to keep this convention? We could generate the sourcetype based on the resource we pull/watch, following kube:object:<resource>, e.g. kube:object:pods. What do you think?

@dmitryax (Contributor) commented Dec 5, 2022

> We could generate the sourcetype based on the resource we pull/watch, following kube:object:<resource>, e.g. kube:object:pods. What do you think?

Sorry, I'm not an expert when it comes to Splunk Platform specific fields. What are the expected values for sourcetype? Do we have any conventions for that?

And how do you want to achieve that? I think we can use the transform processor to compose the sourcetype from the event payload.

@hvaghani221 (Contributor) commented Dec 5, 2022

> Sorry, I'm not an expert when it comes to Splunk Platform specific fields. What are the expected values for sourcetype? Do we have any conventions for that?

The source type is one of the default fields that the Splunk platform assigns to all incoming data. It tells the platform what kind of data you have, so that it can format the data intelligently during indexing. Source types also let you categorize your data for easier searching.

In SCK, we also follow the kube:object:<resourcename> convention.

> And how do you want to achieve that? I think we can use the transform processor to compose the sourcetype from the event payload.

The k8sobjects receiver adds the resource name as an event.name attribute. I was thinking we could use the logstransformprocessor and stanza's add operator to add the com.splunk.sourcetype attribute.

There was a bug in the k8sobjects receiver: the event.name attribute is empty when watch mode is used. I have created open-telemetry/opentelemetry-collector-contrib#16543 to fix that. Can you please review it as well?

@dmitryax (Contributor) commented Dec 5, 2022

@harshit-splunk thanks for explaining

> In SCK, we also follow the kube:object:<resourcename> convention.

Sounds good to me. Do we have to stick with what was adopted in SCK? I think using k8s in k8s:object:<resourcename> is less confusing than kube.

> The k8sobjects receiver adds the resource name as an event.name attribute. I was thinking we could use the logstransformprocessor and stanza's add operator to add the com.splunk.sourcetype attribute.

Let's use the transform processor instead, since that is what will be the standard in the OTel Collector going forward.

@hvaghani221 (Contributor)

> Do we have to stick with what was adopted in SCK? I think using k8s in k8s:object:<resourcename> is less confusing than kube.

Agreed, but we are already using the kube prefix for container logs (i.e. kube:container:<container name>), so we should stick with the same prefix.

> Let's use the transform processor instead, since that is what will be the standard in the OTel Collector going forward.

Okay, makes sense.

@omrozowicz-splunk (Contributor, Author)

> Do we have to stick with what was adopted in SCK? I think using k8s in k8s:object:<resourcename> is less confusing than kube.
>
> Agreed, but we are already using the kube prefix for container logs (i.e. kube:container:<container name>), so we should stick with the same prefix.
>
> Let's use the transform processor instead, since that is what will be the standard in the OTel Collector going forward.
>
> Okay, makes sense.

I've added setting the sourcetype using the transform processor, and it looks correct: [screenshot]

But we have two dependencies here. The first is the k8sobjects receiver, which doesn't set the event.name attribute correctly for watch mode, so it currently looks like this in Splunk: [screenshot]
We need to wait until open-telemetry/opentelemetry-collector-contrib#16543 is merged.

Another thing is that there was a syntax change in the transform processor between 0.64 and 0.65, so the current implementation works for 0.64 (the logs notation instead of log_statements), but it won't work for 0.66.
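
For reference, a hedged sketch of what setting the sourcetype with the transform processor might look like under the post-0.65 syntax. The attribute names and the OTTL Concat call are assumptions for illustration, not the exact statements from this PR:

```yaml
processors:
  transform/k8sobjects_sourcetype:
    # Before 0.65 the top-level key was `logs`; from 0.65 on it is `log_statements`.
    log_statements:
      - context: log
        statements:
          # event.name is assumed to carry the pulled/watched resource name
          - set(attributes["com.splunk.sourcetype"], Concat(["kube:object:", attributes["event.name"]], ""))
```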

@dmitryax (Contributor) commented Dec 5, 2022

> Agreed, but we are already using the kube prefix for container logs (i.e. kube:container:<container name>), so we should stick with the same prefix.

OK, let's keep kube.

> We need to wait until open-telemetry/opentelemetry-collector-contrib#16543 is merged.

Reviewing it

> Another thing is that there was a syntax change in the transform processor between 0.64 and 0.65, so the current implementation works for 0.64 (the logs notation instead of log_statements), but it won't work for 0.66.

It's just a configuration option, right? We will just need to update it here with the upgrade to 0.66. Please add a note comment.

@omrozowicz-splunk changed the title from "[WIP] add k8s object receiver" to "add k8s object receiver" on Dec 6, 2022
@omrozowicz-splunk (Contributor, Author)

> Another thing is that there was a syntax change in the transform processor between 0.64 and 0.65, so the current implementation works for 0.64 (the logs notation instead of log_statements), but it won't work for 0.66.
>
> It's just a configuration option, right? We will just need to update it here with the upgrade to 0.66. Please add a note comment.

Done

UPGRADING.md (outdated):
## 0.64.0 to 0.66.0
Before this change, [k8s events receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8seventsreceiver) was used to collect Kubernetes Events.
Now we utilize [k8s objects receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sobjectsreceiver) that can pull or watch any object from Kubernetes API server.
Therefore, `clusterReceiver.eventsEnabled` is now deprecated, and to maintain the same behavior you should set
Contributor

https://stackoverflow.com/a/9208164

It appears these changes remove clusterReceiver.eventsEnabled functionality in total, making it obsolete. Is this correct?

Contributor Author

Yes, right, the k8s objects receiver does not provide exactly the same functionality as the k8s events receiver. Updated.

Contributor

So #588 (comment) could also be expanded so that we provide an actual deprecation path by using the objects receiver with just the watched events for this value, instead of obsoleting it in one pass. The warning check also needs to be updated with a description of the behavior.

IMO users should never have their config become "broken" during an upgrade, and we should provide at least a release of warnings and ideally remediation logic in helpers (cc @dmitryax).

Contributor Author

OK, so I updated it so that when eventsEnabled is set to true, we return a warning message and event objects are watched, so the behaviour is similar (but not the same, as the previous event formatting was different, and the sourcetype was kube:events while now it is kube:object:<resource_id>). It is described in detail in the upgrading doc.
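
To make that concrete, a hedged sketch of the values.yaml shape being described; only eventsEnabled comes from this thread, while the k8sObjects key name and its entries are assumptions for illustration:

```yaml
clusterReceiver:
  # Deprecated: setting this now only emits a warning and falls back to
  # watching event objects via the k8s objects receiver.
  eventsEnabled: true

  # New-style opt-in (key name assumed for illustration): list the objects
  # to pull or watch explicitly.
  k8sObjects:
    - name: events
      mode: watch
```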

@omrozowicz-splunk requested review from rmfitzpatrick and dmitryax and removed the request for rmfitzpatrick and dmitryax on December 12, 2022 08:13
@@ -0,0 +1,3 @@
pytest-expect file v1
Contributor Author

This expected failure is here because of open-telemetry/opentelemetry-collector-contrib#16543, which is not yet merged.

Contributor

Is this PR functional enough to be merged with the current sourcetype? If so, can we add test_k8s_objects_sourcetype later instead?

Contributor Author

Currently sourcetypes render correctly except in the watch case. So the only test case failing now is the one checking:

    - name: events
      mode: watch

Yes, we can add it later. I deleted the pytest-expect file and ("sourcetype", "kube:object:event", 1), from test_k8s_objects_sourcetype.

@rmfitzpatrick (Contributor) commented Dec 13, 2022

> k8s_event payload: [screenshot]
>
> k8sobjects payload: [screenshot]

Looking at this more, as a user I would find this migration less than desirable and potentially fully breaking. The k8sobjects receiver could benefit from following the OTel semantic conventions and maybe contributing new ones: https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/resource/k8s.yaml

Given the substance of the changes, could it make sense to not alter the events receiver usage at all but have an opt-in for the new objects receiver?

cc @dmitryax @jvoravong @atoulme

@dmitryax (Contributor) commented Dec 16, 2022

> Given the substance of the changes, could it make sense to not alter the events receiver usage at all but have an opt-in for the new objects receiver?

We can leave it as opt-in for now, but I believe we eventually need to deprecate the k8sevents receiver in favor of the k8sobjects receiver. The reason is that I don't think we should parse all the k8s event fields and compose attributes from them, which can lead to unnecessarily high cardinality. Sending events in the format defined by the emitting system actually fits better with the new events API that is currently being defined in the OTel spec. We will work with @harshit-splunk on raising an issue about that for discussion in contrib.

Commits pushed:
- … prefixed with events,
- add the default configuration of objects to pull and watch for k8s objects receiver, update UPGRADING.md, add functional test for k8sobjects
- … delete unnecessary drop_event_attrs, fix if condition for metrics cluster receiver config, add a TODO comment about transform processor, update UPGRADING.md, bring back events receiver, change toJson to toYaml and generate a new manifest run
@dmitryax (Contributor) left a comment

Overall, looks good to me, just a couple of nits.

@omrozowicz-splunk requested review from dmitryax and removed the request for hvaghani221 on January 4, 2023 08:34
@dmitryax (Contributor) left a comment

LGTM. Just a couple more nits, and I think we can merge it. Thanks for all the work!

@@ -231,7 +247,7 @@ service:
{{- end }}
{{- end }}

{{- if and $clusterReceiver.eventsEnabled (eq (include "splunk-otel-collector.logsEnabled" .) "true") }}
{{- if and $clusterReceiver.eventsEnabled (eq (include "splunk-otel-collector.logsEnabled" .) "true") }}
Contributor

nit, no change needed
