Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New component: Sum Connector #32669

Open
2 tasks
greatestusername opened this issue Apr 24, 2024 · 11 comments
Open
2 tasks

New component: Sum Connector #32669

greatestusername opened this issue Apr 24, 2024 · 11 comments
Labels
Accepted Component New component has been sponsored

Comments

@greatestusername
Copy link
Contributor

greatestusername commented Apr 24, 2024

The purpose and use-cases of the new component

Sum Connector takes in logs, metrics, or traces and matches an attribute then allows summing numerical values present in that attribute and sending those sums as a time series metric along with any attributes defined within the connector config.

E.G. Log contains a field for total_price of a cart. Matching on total_price will take numerical values from this field and emit a time series metric of the sums along with any other attributes defined in the connector config.

This will function similarly to -- and be patterned after -- the current countconnector but emitting sums of values rather than counts of occurrences.

Example configuration for the component

receivers:
  foo:
connectors:
  sum:
    logs:
      field.to.metric.sum:
        conditions:
          - attributes["boop"] != "NULL"
        attributes:
          - key: boop
            default_value: unspecified_level
          - key: env
            default_value: no_env
exporters:
  bar:

service:
  pipelines:
    metrics/sum:
       receivers: [sum]
       exporters: [bar]
    logs:
       receivers: [foo]
       exporters: [sum]

Telemetry data types supported

In: Traces, metrics, logs
Out: Metrics

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

greatestusername, shalper2

Sponsor (optional)

@crobert-1

Additional context

No response

@greatestusername greatestusername added needs triage New item requiring triage Sponsor Needed New component seeking sponsor labels Apr 24, 2024
@crobert-1
Copy link
Member

Hello @greatestusername, can you share a bit more details on the proposed configuration? I'm not sure I follow the proposed options and how it would impact the resulting output.

@greatestusername
Copy link
Contributor Author

Thanks @crobert-1
I'm pasting a slightly modified config to better illustrate. Imagine I'm sending logs from a checkout service through the OTel collector.
I want to use the total.payment attribute from those logs to output a sum of total.payments as a time series metric called checkout.total.
I also want to include attributes on that metrics for payment.processor that will default to a value of unspecified_processor if no payment.processor attribute value is available. Similarly I want to include an env attribute on the metric.

Let me know if you need further clarification? Thanks!

receivers:
  foo:
connectors:
  sum:
    logs:
      checkout.total:
        conditions:
          - attributes["total.payment"] != "NULL"
        attributes:
          - key: payment.processor
            default_value: unspecified_processor
          - key: env
            default_value: no_env
exporters:
  bar:

service:
  pipelines:
    metrics/sum:
       receivers: [sum]
       exporters: [bar]
    logs:
       receivers: [foo]
       exporters: [sum]

@crobert-1
Copy link
Member

Thanks for sharing another example, that was helpful for me to better understand. A couple more questions:

  1. For your given example, shouldn't we also specify source_attribute or something like that? Maybe I'm missing something, but the given config doesn't seem to specify where the actual value is coming from, it just filters logs where total.payment is NULL.
  2. I assume the default configuration for this component would be no options specified, and then every attribute would be summed, is that accurate? Then if a single new metric is defined, do we only keep that, and drop the rest?

@crobert-1
Copy link
Member

I'll sponsor this component 👍

@crobert-1 crobert-1 added Accepted Component New component has been sponsored and removed Sponsor Needed New component seeking sponsor needs triage New item requiring triage labels May 14, 2024
@greatestusername
Copy link
Contributor Author

Thanks for sharing another example, that was helpful for me to better understand. A couple more questions:

  1. For your given example, shouldn't we also specify source_attribute or something like that? Maybe I'm missing something, but the given config doesn't seem to specify where the actual value is coming from, it just filters logs where total.payment is NULL.
  2. I assume the default configuration for this component would be no options specified, and then every attribute would be summed, is that accurate? Then if a single new metric is defined, do we only keep that, and drop the rest?

Thank you for offering to sponsor this!

Also thank you for pointing this out! Yes a source_attribute does make sense! I was initially using the conditions and looking for the field in the conditions. So yes I think a single source_attribute setting to define which attribute to look for the numerical value. Then any conditions are just conditions rather than where the value is found.

I think having source_attribute would also answer question 2? We'd only ever metricize the defined attribute then sum those values into a single metric.

Example config with source_attribute:

receivers:
  foo:
connectors:
  sum:
    logs:
      checkout.total:
        source_attribute: 
          - attributes["total.payment"]
        conditions:
          - attributes["total.payment"] != "NULL"
        attributes:
          - key: payment.processor
            default_value: unspecified_processor
          - key: env
            default_value: no_env
exporters:
  bar:

service:
  pipelines:
    metrics/sum:
       receivers: [sum]
       exporters: [bar]
    logs:
       receivers: [foo]
       exporters: [sum]

Does that make sense?

@crobert-1
Copy link
Member

Do you think there would be value in having some default behavior for a no configuration option? Something like:

connectors:
  sum:
    logs:

It would result in a lot of internal logic and decision making that we'd have to work through, but maybe it would be useful for someone?

@greatestusername
Copy link
Contributor Author

In that sort of case where would it be getting the numerical value to sum? All attributes with numerical values? I'm not sure how that would work exactly. But interested in hearing more on the idea!

@crobert-1
Copy link
Member

It might be some kind of check if a value can be converted to a number, and keep it if successful. It's not a requirement for this component by any means though, and may be more complicated than it's worth.

You're welcome to start submitting PRs, I think we've got a good outline here!

@greatestusername
Copy link
Contributor Author

Awesome! I'll get a wireframe PR up in the next couple days so I can get started on the meat. Thank you! And apologies for my tardy reply.

@greatestusername
Copy link
Contributor Author

greatestusername commented Jun 24, 2024

Edit: I've got a PR up that should work

I'm having some trouble getting tests for connectors to pass in the initial PR. I'm just trying to get a wireframe going but am hitting failures on tests like TestComponentLifecycle/logs_to_metrics and other telemetry generation checks should I just be stubbing these for now so the initial PR isn't so large?

mx-psi pushed a commit that referenced this issue Jul 3, 2024
**Description:** Initial wireframe PR for a new sum connector (as
described in #32669)
Provides bare minimum to start developing a new connector (config,
factory, integration test stubs for connector lifecycle, etc)

Sum Connector takes in logs, metrics, or traces and matches an attribute
then allows summing numerical values present in that attribute and
sending those sums as a time series metric along with any attributes
defined within the connector config.

E.G. Log contains a field for `total_price` of a payment. Matching on
`total_price` as the `source_attribute` will take numerical values from
this attributte and emit a time series metric (`checkout.total`) of the
sums along with any other attributes defined in the connector config.


Example config:
```
receivers:
  foo:
connectors:
  sum:
    logs:
      checkout.total:
        source_attribute: 
          - attributes["total_price"]
        conditions:
          - attributes["total_price"] != "NULL"
        attributes:
          - key: payment.processor
            default_value: unspecified_processor
          - key: env
            default_value: no_env
exporters:
  bar:

service:
  pipelines:
    metrics/sum:
       receivers: [sum]
       exporters: [bar]
    logs:
       receivers: [foo]
       exporters: [sum]
```      

**Link to tracking Issue:** 32669

**Testing:** Generated component and package tests

**Documentation:** README and issue #32669 provide current documentation
of projected future state.
codeboten pushed a commit that referenced this issue Jul 25, 2024
**Description:** Fleshes out the config more fully and provides tests
for the config and possible combinations of settings

**Link to tracking Issue:** #32669

**Testing:** Added test cases and testing for config

**Documentation:** No new documentation yet.
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Aug 26, 2024
@crobert-1 crobert-1 removed the Stale label Aug 26, 2024
mx-psi pushed a commit that referenced this issue Oct 7, 2024
…ata) (#35434)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
Adds tests for metrics to metrics and logs to metrics
follows this pr:
#34797

This is a lot of lines but it is mostly repetition and uses the same
standard structure as the previously merged trace to metric tests.

I dont think this needs any changelog but let me know if it does for
some reason! 😸

**Link to tracking Issue:** #32669 

**Testing:** <Describe what testing was performed and which tests were
added.>
Adds testing for metrics to metrics and logs to metrics using same
methodology as previous PR

**Documentation:** No changes to docs. This is just tests.

---------

Co-authored-by: Curtis Robert <crobert@splunk.com>
Eromosele-SM pushed a commit to sematext/opentelemetry-collector-contrib that referenced this issue Oct 9, 2024
…ata) (open-telemetry#35434)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
Adds tests for metrics to metrics and logs to metrics
follows this pr:
open-telemetry#34797

This is a lot of lines but it is mostly repetition and uses the same
standard structure as the previously merged trace to metric tests.

I dont think this needs any changelog but let me know if it does for
some reason! 😸

**Link to tracking Issue:** open-telemetry#32669 

**Testing:** <Describe what testing was performed and which tests were
added.>
Adds testing for metrics to metrics and logs to metrics using same
methodology as previous PR

**Documentation:** No changes to docs. This is just tests.

---------

Co-authored-by: Curtis Robert <crobert@splunk.com>
mx-psi pushed a commit that referenced this issue Oct 29, 2024
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
- This PR changes stability to `alpha`
- Updates README.md for more clarity and detailed config


<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
[#32669]

<!--Describe what testing was performed and which tests were added.-->
#### Testing
No additional testing needed

<!--Describe the documentation added.-->
#### Documentation
Adds clarification of config, additional detail to detailed config, and
notes about when and how to use which telemetry type.

<!--Please delete paragraphs that you did not use before submitting.-->

---------

Co-authored-by: jeremyh <jeremyh@splunk.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Component New component has been sponsored
Projects
None yet
Development

No branches or pull requests

2 participants