feat(processors.parser): Add option to parse tags #11228

Hipska · 2022-05-31T12:42:49Z

Required for all PRs:

Updated associated README.md.
Wrote appropriate unit tests.
Pull request title or commits are in conventional commit format

Add an option to also parse tags (as the example config description was already saying that).

I don't have added tests yet. If someone could tell me the test(name)s I should add, I'm happy to implement them..

This is a revive of #11003 since that one was broken..

srebhan · 2022-06-10T14:57:25Z

@reimda any comment here? Code looks good so far.

reimda · 2022-06-10T22:30:22Z

Hi Thomas, could you tell me more about your use case? It still seems to me like the converter plugin already provides this functionality in a simple way. I think it's only a line or two more configuration than what you propose in this PR, and it requires no additional code.

Sorry I haven't answered your questions from the previous PR (#11003 (comment)) until now:

but the sample config already mentioned tags

Handling tags actually was part of the original PR for this plugin (#4551). It was removed intentionally because the converter plugin made it redundant. Here are the commits where it was removed:

remove tags tests - 38733c1
remove tags functionality - c736252

The sample config mentions "field/tag(s)" only because it wasn't removed along with related stuff in those two commits. Ever since the plugin was merged, the mention of tags in the docs has been a typo. It's not a good reason to add it back.

I think it was right to remove it and encourage people to use converter. Having processors act on either tags or fields but not both reduces duplicated code and tests. If every plugin has to handle both tags and fields, very similar code and tests are duplicated in every plugin. Everywhere you act on one you have to act on the other, doubling that kind of code in every plugin. More code means more potential for bugs and slower unit tests.

If we have plugins handle only tags or only fields, and then we rely on the converter to switch from one to the other in unusual use cases, we don't have that doubling of tag/field handling code in the plugins.

Other processors also work on both tags and fields

Some do, but most don't. Most of them act on either tag or field depending on the main use case. If someone is using a plugin in an unintended way (not the main use case) they are expected to use converter.

Bold ones in the table below act on either field or tag but not both, italic ones act on both.

name	input source	output destination
aws	tag	tag
clone	(none)	(none)
converter	tag, field	tag, field, metric name
date	(none)	tag, field
dedup	(none)	(none)
defaults	field	field
enum	tag or field	same as input
execd	any	any
filepath	tag, field	same as input
ifname	tag	tag
noise	field	field
override	(none)	tag
parser	field	field
pivot	tag	field
port_name	tag or field	same as input
printer	(none)	(none)
regex	tag or field	same as input
rename	tag, field, metric name	same as input
reverse_dns	tag, field	same as input
s2geo	fields	tag
starlark	any	any
strings	tag, field, metric name	same as input
tag_limit	tag	tag (removes)
template	tag, field, metric name	tag
topk	series (including values)	add tag
unpivot	field	tag

I'm not completely opposed to having a processor act on both fields and tags if there's a good reason for it. Most of the italic ones in the table make sense to me. For example, if you want to rename a tag, it would be crazy to have to convert it to a field, rename the field, then convert it back to a tag. It makes sense for the rename processor to act on both tags and fields.

Parsing a tag value doesn't make the same sense to me as renaming a tag. Usually data that's complex enough to be parsed is held in fields, so that's the main use case. In your case it must not be this way. If you could describe your use case and why you have data in a tag that needs to be parsed, maybe it would clear up for me why it's not acceptable to use the converter processor.

Hipska · 2022-06-13T11:13:44Z

So this is my use case:

[[processors.strings]]
  order = 1
  namepass = ["SGSN-MME_*"]

  [[processors.strings.replace]]
    tag = "MeasObjLdn"
    old = ","
    new = " "


[[processors.converter]]
  order = 2
  namepass = ["SGSN-MME_*"]

  ## Tags to convert
  [processors.converter.tags]
    string = ["MeasObjLdn"]


# Parse a value in a specified field/tag(s) and add the result in a new metric
[[processors.parser]]
  order = 3
  namepass = ["SGSN-MME_*"]

  ## The name of the fields whose value will be parsed.
  parse_fields = ["MeasObjLdn"]

  ## If true, incoming metrics are not emitted.
  # drop_original = false

  ## If set to override, emitted metrics will be merged by overriding the
  ## original metric using the newly parsed metrics.
  merge = "override"

  ## The dataformat to be read from files
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "logfmt"

  ## Array of key names which should be collected as tags. Globs accepted.
  logfmt_tag_keys = ["*"]


# Apply metric modifications using override semantics.
[[processors.override]]
  order = 4
  namepass = ["SGSN-MME_*"]

  ## Fields to drop
  fielddrop = ["MeasObjLdn"]

The MeasObjLdn contains values like:

ManagedElement=<hostname>,SgsnMme=1,RouterInstance=<foobar>
ManagedElement=<hostname>,SgsnMme=1,IpInterface=ETH_1_25_1_1920

So they are all identifying and thus a tag but not a field. Wit this PR, I could eliminate processor 2 and 4 (as my output does not support string fields).

Hipska · 2022-09-07T16:07:14Z

@reimda WDYT?

reimda · 2022-09-08T21:32:14Z

Hi Thomas. Where does the MeasObjLdn tag come from? The use case you shared only includes the processors, not inputs. When I googled the term I got hits in a digital cellular XML schema (ETSI TS 132 435 V11) which sounds related to your work.

If it's coming from parsed xml, could you get rid of parser "order = 2" by having the xml parser save it as a field instead of a tag? Maybe that isn't an option due to other constraints?

I know this specific case is useful to you so I'm not going to refuse to merge the PR. The code and docs changes look good to me. It still does seem to me like merging this PR is a small step towards making all processors work on both tags and fields which is not the most maintainable and usable.

Hipska · 2022-09-09T07:19:44Z

You are right, it indeed comes from parsing specific xml files. I could indeed set it as a field, but not knowing on beforehand the name of the other fields, there could be a duplicate naming where one or the other gets overwritten. (I know, bad practice, but Murphy's law is not helping)

So we're fine with this after adding tests?

reimda · 2022-09-09T14:05:39Z

Yes, with tests added I'll merge it.

Hipska · 2022-09-12T14:40:37Z

Done, hope I made it for the release :)

telegraf-tiger · 2022-09-12T15:00:56Z

Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

👍 This pull request doesn't change the Telegraf binary size

📦 Click here to get additional PR build artifacts

Artifact URLs

DEB	RPM	TAR GZ	ZIP
amd64.deb	aarch64.rpm	darwin_amd64.tar.gz	windows_amd64.zip
arm64.deb	armel.rpm	darwin_arm64.tar.gz	windows_i386.zip
armel.deb	armv6hl.rpm	freebsd_amd64.tar.gz
armhf.deb	i386.rpm	freebsd_armv7.tar.gz
i386.deb	ppc64le.rpm	freebsd_i386.tar.gz
mips.deb	riscv64.rpm	linux_amd64.tar.gz
mipsel.deb	s390x.rpm	linux_arm64.tar.gz
ppc64el.deb	x86_64.rpm	linux_armel.tar.gz
riscv64.deb		linux_armhf.tar.gz
s390x.deb		linux_i386.tar.gz
		linux_mips.tar.gz
		linux_mipsel.tar.gz
		linux_ppc64le.tar.gz
		linux_riscv64.tar.gz
		linux_s390x.tar.gz
		static_linux_amd64.tar.gz

reimda · 2022-09-12T15:36:58Z

1.24.0 was tagged about a half hour ago so this didn't make it. Sorry I didn't see it was ready.

Hipska · 2022-09-13T07:56:19Z

Ah bummer, will then go into 1.25 or 1.24.1?

reimda · 2022-09-14T15:24:13Z

Ah bummer, will then go into 1.25 or 1.24.1?

Yes. 1.24.1 is scheduled for October 3

(cherry picked from commit d976158)

Hipska · 2022-09-20T07:07:32Z

Well it's there sooner than expected 😉

Hipska added 2 commits April 20, 2022 12:15

feat(processors/parser): Add option to parse tags

47d2149

Merge branch 'master' into feature/processors/parser/tags

695d6ee

telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label May 31, 2022

Hipska mentioned this pull request May 31, 2022

feat(processors/parser): Add option to parse tags #11003

Closed

3 tasks

Hipska added the plugin/processor label May 31, 2022

chore(processors/parser): Cleanup wrong merge state

c737d6a

Hipska requested review from reimda, sspaink and srebhan June 1, 2022 15:28

chore(processors/parser): Remove obsolete test method

dc7ed3c

Hipska changed the title ~~feat(processors/parser): Add option to parse tags~~ feat(processors.parser): Add option to parse tags Sep 12, 2022

Hipska added 4 commits September 12, 2022 15:54

Merge branch 'master' into feature/processors/parser/tags

ac465bf

fix(processors/parser): Copy name for empty metrics

e75101a

test(processors/parser): Add tests

5839f44

docs(processors/parser): Satisfy stupid line-length linter

55c976a

reimda approved these changes Sep 12, 2022

View reviewed changes

reimda added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Sep 12, 2022

reimda merged commit d976158 into influxdata:master Sep 12, 2022

Hipska deleted the feature/processors/parser/tags branch September 13, 2022 07:55

reimda pushed a commit that referenced this pull request Sep 19, 2022

feat(processors.parser): Add option to parse tags (#11228)

7cf14e6

(cherry picked from commit d976158)

dba-leshop pushed a commit to dba-leshop/telegraf that referenced this pull request Oct 30, 2022

feat(processors.parser): Add option to parse tags (influxdata#11228)

46e1b77

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(processors.parser): Add option to parse tags #11228

feat(processors.parser): Add option to parse tags #11228

Hipska commented May 31, 2022 •

edited

Loading

srebhan commented Jun 10, 2022

reimda commented Jun 10, 2022

Hipska commented Jun 13, 2022

Hipska commented Sep 7, 2022

reimda commented Sep 8, 2022 •

edited

Loading

Hipska commented Sep 9, 2022

reimda commented Sep 9, 2022

Hipska commented Sep 12, 2022

telegraf-tiger bot commented Sep 12, 2022

Artifact URLs

reimda commented Sep 12, 2022

Hipska commented Sep 13, 2022

reimda commented Sep 14, 2022

Hipska commented Sep 20, 2022

feat(processors.parser): Add option to parse tags #11228

feat(processors.parser): Add option to parse tags #11228

Conversation

Hipska commented May 31, 2022 • edited Loading

Required for all PRs:

srebhan commented Jun 10, 2022

reimda commented Jun 10, 2022

Hipska commented Jun 13, 2022

Hipska commented Sep 7, 2022

reimda commented Sep 8, 2022 • edited Loading

Hipska commented Sep 9, 2022

reimda commented Sep 9, 2022

Hipska commented Sep 12, 2022

telegraf-tiger bot commented Sep 12, 2022

Artifact URLs

reimda commented Sep 12, 2022

Hipska commented Sep 13, 2022

reimda commented Sep 14, 2022

Hipska commented Sep 20, 2022

Hipska commented May 31, 2022 •

edited

Loading

reimda commented Sep 8, 2022 •

edited

Loading