Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/k8sattributes] Deprecate and remove FieldExtractConfig.Regex config option #25128

Open
dmitryax opened this issue Aug 9, 2023 · 17 comments
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers never stale Issues marked with this label will be never staled and automatically removed priority:p2 Medium processor/k8sattributes k8s Attributes processor

Comments

@dmitryax
Copy link
Member

dmitryax commented Aug 9, 2023

The configuration interface of the k8sattributes receiver is over-complicated. This is an attempt to simplify it a bit.

FieldExtractConfig have KeyRegex and Regex options, which is pretty confusing. The Regex config option doesn't seem to be very important. I can be easily replaced with a transform processor with replace_pattern and replace_match functions. Therefore, I suggest:

  1. Deprecate FieldExtractConfig.Regex option with instructions on how to replace it with the transform processor
  2. Delete the option after 5 releases.
@dmitryax dmitryax added enhancement New feature or request priority:p2 Medium processor/k8sattributes k8s Attributes processor help wanted Extra attention is needed good first issue Good for newcomers labels Aug 9, 2023
@dmitryax dmitryax changed the title [processors/k8sattributes] Deprecated and remove FieldExtractConfig.Regex config option [processors/k8sattributes] Deprecate and remove FieldExtractConfig.Regex config option Aug 10, 2023
@bryan-aguilar
Copy link
Contributor

bryan-aguilar commented Aug 11, 2023

Doc for config options.

This seems to place a requirement for a collector build to include the transform processor to match this behavior. Is introducing the requirement of a separate processor the best choice for all users? Or should the k8sattr processor be able to do this in isolation?

I think this could turn into a more meta question for the collector community. How do we decide that a dependency on another component is the correct choice?

For example for the tail sampling processor, there's a soft (hard?) dependency on the group by trace processor. I don't even know if the group by tail sampling processor example is correct. That relationship may just be a "best practice" per say. The tail sampling processor has it's own decision_wait configuration.

@dmitryax
Copy link
Member Author

dmitryax commented Aug 11, 2023

This one is a pretty straightforward separation of concerns. Parsing of an attribute value has nothing to do with k8s. So it should not be part of this processor. It was added a long time ago as extra functionality before we had any option to parse attribute values. Now we have the transform processor, which includes this particular feature. So we are not putting any dependency on another component, we are just removing functionality that doesn't belong to k8s attributes processor. Please let me know if it makes sense.

@dmitryax
Copy link
Member Author

dmitryax commented Aug 11, 2023

It may become less performant after this change, but I believe it's more important to keep a particular purpose for each component and avoid bloating and overcomplicating them with features that don't fit them. Happy to hear other opinions from @open-telemetry/collector-contrib-approvers

@shivanshuraj1333
Copy link
Member

This one is a pretty straightforward separation of concerns. Parsing of an attribute value has nothing to do with k8s. So it should not be part of this processor. It was added a long time ago as extra functionality before we had any option to parse attribute values.

I agree.
wrt to performance concerns I'm not sure how huge the impact would be, in production users must be using transform processor and it does make sense to deprecate FieldExtractConfig.KeyRegex in lieu of transform processor.

@shivanshuraj1333
Copy link
Member

if the proposal gets accepted, can pick up the work

@TylerHelmuth
Copy link
Member

It may become less performant after this change, but I believe it's more important to keep a particular purpose for each component and avoid bloating and overcomplicating them with features that don't fit them.

I agree. Holding to separation of concerns is a good idea. As for performance, we can benchmark it, but the transformprocessor is quite performant.

We should not remove this feature from the processor, though, until the equivalent feature exists in the transformprocessor. It cannot currently extract new attributes using regex, we need to add the functionality via a new OTTL function.

@dmitryax
Copy link
Member Author

@TylerHelmuth replace_pattern should work already, I believe. Let me know if I miss something

k8sattributes:
  extract:
    annotations:
      - tag_name: git.sha
        key: kubernetes.io/change-cause
        regex: GIT_SHA=(?P<value>\w+)

should be replaceable with

k8sattributes:
  extract:
    annotations:
      - tag_name: git.sha
        key: kubernetes.io/change-cause
transform:
  metric_statements:
    - context: resource
      statements: 
        - delete(attributes["git.sha"]) where attributes["git.sha"] != nil and not IsMatch(attributes["git.sha"], "GIT_SHA=\w+")
        - replace_pattern(attributes["git.sha"], "GIT_SHA=(\w+)", "$1") where attributes["git.sha"] != nil

But if we had extract func, it'd be cleaner:

k8sattributes:
  extract:
    annotations:
      - tag_name: k8s.change_cause
        key: kubernetes.io/change-cause
transform:
  metric_statements:
    - context: resource
      statements: 
        - extract(attributes, attributes["k8s.change_cause"], "GIT_SHA=(?P<git.sha>\w+)") where attributes["k8s.change_cause"] != nil
        - delete(attributes["k8s.change_cause"]) where attributes["k8s.change_cause"] != nil

Is this what you're thinking of?

@TylerHelmuth
Copy link
Member

@dmitryax I forgot replace_pattern could use $1. You can remove a couple extra nil checks:

k8sattributes:
  extract:
    annotations:
      - tag_name: git.sha
        key: kubernetes.io/change-cause
transform:
  metric_statements:
    - context: resource
      statements:
        - delete(attributes["git.sha"]) where not IsMatch(attributes["git.sha"], "GIT_SHA=\w+")
        - replace_pattern(attributes["git.sha"], "GIT_SHA=(\w+)", "$1")

I see why it is so convenient to do the extraction within the k8sattributesprocessor - if you leave it for the transformprocessor then you have to explicitly check whether or not the attribute should be deleted.

If we add an extract function we could add an optional param to help with this:

k8sattributes:
  extract:
    annotations:
      - tag_name: k8s.change_cause
        key: kubernetes.io/change-cause
transform:
  metric_statements:
    - context: resource
      statements: 
        - extract(attributes, attributes["k8s.change_cause"], "GIT_SHA=(?P<git.sha>\w+)", deleteOriginal=true)

@dmitryax
Copy link
Member Author

@TylerHelmuth LGTM. We should introduce that function before working on this issue to have simpler migration guidelines. Please see #25834 and edit it if necessary

@dmitryax dmitryax changed the title [processors/k8sattributes] Deprecate and remove FieldExtractConfig.Regex config option [processor/k8sattributes] Deprecate and remove FieldExtractConfig.Regex config option Aug 16, 2023
@TylerHelmuth
Copy link
Member

@dmitryax after working on #24599 for a bit I realized the existing extraction logic allows grabbing all the annotations/labels, which is really useful for users who want to grab everything. I believe the options we've discussed so far will remove that capability. I don't want to lose that feature, so if we move forward with removing key_regex then I think we should implement an "everything" configuration option like the one proposed in #24599.

@dmitryax
Copy link
Member Author

We don't lose that feature with this proposal. I'm not asking to remove key_regex, I'm asking to remove regex which is applied on labels/annotation values not keys

@dmitryax
Copy link
Member Author

dmitryax commented Aug 18, 2023

I just found and fixed a typo in the issue description which likely brought the confusion

@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Oct 17, 2023
@dmitryax dmitryax removed the Stale label Oct 17, 2023
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 18, 2023
@TylerHelmuth TylerHelmuth added never stale Issues marked with this label will be never staled and automatically removed and removed Stale labels Dec 18, 2023
@odubajDT
Copy link
Contributor

odubajDT commented Jun 6, 2024

Hi, I would like to work on this issue cc @evan-bradley

@TylerHelmuth

This comment was marked as resolved.

evan-bradley added a commit that referenced this issue Jul 29, 2024
…ption (#33411)

**Link to tracking Issue:** #25128

---------

Signed-off-by: odubajDT <ondrej.dubaj@dynatrace.com>
Co-authored-by: Curtis Robert <crobert@splunk.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
@TylerHelmuth
Copy link
Member

The above comment was wrong as the scenario wasn't affected by the removal of the regex field.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers never stale Issues marked with this label will be never staled and automatically removed priority:p2 Medium processor/k8sattributes k8s Attributes processor
Projects
None yet
Development

No branches or pull requests

6 participants