New component: Coralogix Processor #33090

Open
2 tasks done
galrose opened this issue May 16, 2024 · 6 comments
Labels
Accepted Component New component has been sponsored

Comments

@galrose
Contributor

galrose commented May 16, 2024

The purpose and use-cases of the new component

The Coralogix processor is intended for clients using Coralogix.
The processor will have multiple features. The first is templating db.statement values by replacing the variables in each query with ?. The result is added to the span as two new attributes, db.statement.blueprint and db.statement.blueprint.id, which Coralogix uses internally to recognize which queries share the same template.

Initially we expect it to work only for postgresql, mysql, and sqlserver.
The processor checks db.system and tries to parse the query only if the value is one of the recognized systems.
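
As a rough illustration of the templating step described above, here is a minimal sketch in Go. It is not the processor's implementation: the regular expressions, the helper names (blueprint, blueprintID, annotate), the FNV hash used for the ID, and the exact db.system strings are all assumptions made for the example. A real implementation would need a proper SQL tokenizer, since this naive version also rewrites things like the digits in positional placeholders.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"regexp"
	"strings"
)

var (
	// Single-quoted SQL string literals, allowing '' as an escaped quote.
	stringLiteral = regexp.MustCompile(`'(?:[^']|'')*'`)
	// Standalone integer or decimal literals (digits inside identifiers are skipped).
	numberLiteral = regexp.MustCompile(`\b\d+(?:\.\d+)?\b`)
	whitespace    = regexp.MustCompile(`\s+`)
)

// supportedSystems lists the db.system values the sketch accepts.
// Assumption: the strings match the names used in the proposal text.
var supportedSystems = map[string]bool{
	"postgresql": true,
	"mysql":      true,
	"sqlserver":  true,
}

// blueprint replaces literals with ? and collapses runs of whitespace.
func blueprint(statement string) string {
	s := stringLiteral.ReplaceAllString(statement, "?")
	s = numberLiteral.ReplaceAllString(s, "?")
	return strings.TrimSpace(whitespace.ReplaceAllString(s, " "))
}

// blueprintID derives a stable identifier from a blueprint.
// Assumption: a 64-bit FNV-1a hash; the proposal does not say how IDs are built.
func blueprintID(bp string) string {
	h := fnv.New64a()
	h.Write([]byte(bp))
	return fmt.Sprintf("%016x", h.Sum64())
}

// annotate adds the two blueprint attributes when db.system is recognized.
// It reports whether the attributes were added.
func annotate(attrs map[string]string) bool {
	if !supportedSystems[attrs["db.system"]] {
		return false
	}
	stmt, ok := attrs["db.statement"]
	if !ok {
		return false
	}
	bp := blueprint(stmt)
	attrs["db.statement.blueprint"] = bp
	attrs["db.statement.blueprint.id"] = blueprintID(bp)
	return true
}

func main() {
	attrs := map[string]string{
		"db.system":    "postgresql",
		"db.statement": "SELECT * FROM users WHERE id = 42 AND name = 'bob'",
	}
	if annotate(attrs) {
		fmt.Println(attrs["db.statement.blueprint"])
		fmt.Println(attrs["db.statement.blueprint.id"])
	}
}
```

Two queries that differ only in their literal values produce the same blueprint, and therefore the same ID, which is the property the proposal relies on for grouping.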

There is an option to work with sampling. When it is enabled, the processor adds the sampling.priority key only to spans whose db.statement.blueprint it has not seen before (tracked with an internal cache). The probabilistic sampler can then be used to send only those new spans.
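
The "first time seen" behavior can be sketched as follows. This is only an illustration under stated assumptions: seenCache, firstSighting, markForSampling, the priority value of 100, and the evict-an-arbitrary-entry policy are all hypothetical and are not taken from the processor.

```go
package main

import (
	"fmt"
	"sync"
)

// seenCache is a hypothetical bounded set of blueprint IDs.
type seenCache struct {
	mu         sync.Mutex
	seen       map[string]struct{}
	maxEntries int
}

func newSeenCache(maxEntries int) *seenCache {
	return &seenCache{seen: make(map[string]struct{}), maxEntries: maxEntries}
}

// firstSighting records id and reports whether it was absent before the call.
func (c *seenCache) firstSighting(id string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.seen[id]; ok {
		return false
	}
	// Assumption: when full, drop one arbitrary entry to make room.
	if len(c.seen) >= c.maxEntries {
		for k := range c.seen {
			delete(c.seen, k)
			break
		}
	}
	c.seen[id] = struct{}{}
	return true
}

// newBlueprintPriority is an assumed value; the proposal does not specify one.
const newBlueprintPriority = 100

// markForSampling sets sampling.priority only for blueprint IDs not seen before.
func markForSampling(attrs map[string]any, c *seenCache) {
	id, ok := attrs["db.statement.blueprint.id"].(string)
	if !ok {
		return
	}
	if c.firstSighting(id) {
		attrs["sampling.priority"] = newBlueprintPriority
	}
}

func main() {
	cache := newSeenCache(1000)
	first := map[string]any{"db.statement.blueprint.id": "abc"}
	repeat := map[string]any{"db.statement.blueprint.id": "abc"}
	markForSampling(first, cache)
	markForSampling(repeat, cache)
	_, firstMarked := first["sampling.priority"]
	_, repeatMarked := repeat["sampling.priority"]
	fmt.Println(firstMarked, repeatMarked)
}
```

Only the first span carrying a given blueprint ID gets the key, so a downstream sampler that honors sampling.priority would keep one example span per template.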

Example configuration for the component

basic setup

processors:
  coralogix:
    db_statement_blueprints:
      with_sampling: true

with cache

processors:
  coralogix:
    db_statement_blueprints:
      with_sampling: true
      cache_config:
        max_cache_size_bytes: 1073741824 # 1 GiB
        max_cached_entries: 10000000
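
For reference, the YAML above could map onto Go config structs along these lines. This is a sketch only: the type names and the Validate rules are assumptions, and the struct tags simply mirror the keys shown in the example configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// CacheConfig mirrors the cache_config block (hypothetical type name).
type CacheConfig struct {
	MaxCacheSizeBytes int64 `mapstructure:"max_cache_size_bytes"`
	MaxCachedEntries  int64 `mapstructure:"max_cached_entries"`
}

// DBStatementBlueprintsConfig mirrors the db_statement_blueprints block.
type DBStatementBlueprintsConfig struct {
	WithSampling bool        `mapstructure:"with_sampling"`
	CacheConfig  CacheConfig `mapstructure:"cache_config"`
}

// Config is the top-level processor configuration in this sketch.
type Config struct {
	DBStatementBlueprints DBStatementBlueprintsConfig `mapstructure:"db_statement_blueprints"`
}

// Validate rejects negative limits. Assumption: zero means "use a default".
func (c *Config) Validate() error {
	cc := c.DBStatementBlueprints.CacheConfig
	if cc.MaxCacheSizeBytes < 0 {
		return errors.New("max_cache_size_bytes must not be negative")
	}
	if cc.MaxCachedEntries < 0 {
		return errors.New("max_cached_entries must not be negative")
	}
	return nil
}

func main() {
	cfg := Config{DBStatementBlueprints: DBStatementBlueprintsConfig{
		WithSampling: true,
		CacheConfig:  CacheConfig{MaxCacheSizeBytes: 1 << 30, MaxCachedEntries: 10000000},
	}}
	fmt.Println(cfg.Validate() == nil)
}
```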

Telemetry data types supported

traces

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

No response

Sponsor (optional)

No response

Additional context

No response

@galrose galrose added needs triage New item requiring triage Sponsor Needed New component seeking sponsor labels May 16, 2024
@crobert-1
Member

Hello @galrose, since this is a vendor-specific component, I'll sponsor as I'm next in the list of rotating sponsors.

A couple of comments/questions.

  1. Config options are required to be snake case, so can you update dbStatementBlueprints to be db_statement_blueprints?
  2. I see the proposed telemetry type is for traces, but the mentioned SQL databases are all metric receivers, and don't support traces. I assume I'm missing something here, but what's the expectation as far as getting traces from these different databases?
  3. Can you share more information on your goal with sampling, and the cache config options? I'm not sure I entirely follow.

@crobert-1 crobert-1 added Accepted Component New component has been sponsored and removed Sponsor Needed New component seeking sponsor needs triage New item requiring triage labels May 16, 2024
@galrose
Contributor Author

galrose commented May 19, 2024

Hey @crobert-1 thanks for sponsoring the component 😄

  1. changed to snake_case

  2. The component will go over the client spans that the services create and template the db.statement attribute. I don't expect to get spans from the database itself.

  3. Our goal with sampling is for clients to be able to send span metrics without sending their spans, but for our Database Catalog we want to be able to show clients an example of the query itself and the template that will be created.
    So we need at least one span containing a db.statement template that we haven't seen before, so that we can display an example db.statement to the client.

The cache config options are the ristretto cache config options:

sp.cache, err = ristretto.NewCache(&ristretto.Config{
	NumCounters: numCounters, // number of keys to track frequency of.
	MaxCost:     cacheSize,   // maximum cost of cache.
	BufferItems: bufferItems, // number of keys per Get buffer.
})

@crobert-1
Member

Bit of a side note, but I'd suggest de-coupling the configuration options from the specific golang cache as much as possible, to make it simpler to change cache package if necessary. On the same note, it looks like the ristretto package hasn't had any new release for 1.5 years now, and I'm seeing some PRs that have been open for a long time without review. I have no experience with this package, but just thought I should call it out.
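
One way to picture the de-coupling suggested above is to keep the user-facing options library-neutral and translate them at construction time. The sketch below is hypothetical: cacheLimits, ristrettoParams, and toRistrettoParams are illustrative names, the params struct is a local stand-in rather than the ristretto type, and the 10x counter ratio and 64 buffer items follow the defaults suggested in the ristretto documentation rather than anything decided in this thread.

```go
package main

import "fmt"

// cacheLimits is a library-neutral view of the user-facing options.
type cacheLimits struct {
	MaxSizeBytes int64
	MaxEntries   int64
}

// ristrettoParams is a local stand-in for the three fields shown in the
// snippet earlier in this thread; it is not the ristretto type itself.
type ristrettoParams struct {
	NumCounters int64
	MaxCost     int64
	BufferItems int64
}

// toRistrettoParams translates neutral limits into library-specific values.
// Assumptions: 10 counters per expected entry and 64 buffer items.
func toRistrettoParams(l cacheLimits) ristrettoParams {
	return ristrettoParams{
		NumCounters: l.MaxEntries * 10,
		MaxCost:     l.MaxSizeBytes,
		BufferItems: 64,
	}
}

func main() {
	p := toRistrettoParams(cacheLimits{MaxSizeBytes: 1 << 30, MaxEntries: 10000000})
	fmt.Printf("%+v\n", p)
}
```

With this shape, swapping the cache package would mean replacing only the translation function, while the configuration keys stay the same.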

Otherwise, you're welcome to start submitting PRs! Please follow the guide for adding new components, in terms of contents for each PR. Looking forward to making progress on this!

@galrose
Contributor Author

galrose commented May 22, 2024

That makes sense, I'll update the configs in this issue accordingly.
Thanks for the help :) I'll open the first PR from the guide soon.

Contributor

github-actions bot commented Aug 2, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Aug 2, 2024
@crobert-1 crobert-1 removed the Stale label Aug 2, 2024
MovieStoreGuy pushed a commit that referenced this issue Aug 9, 2024
**Description:**
Adds a feature to create templates (blueprints) from SQL queries.
Currently limited to postgresql and mysql queries.

**Link to tracking Issue:** #33090

**Testing:**
Currently no tests; they will be added in the next PR.

**Documentation:**
Added documentation for the possible configuration and the use case of the
processor.

---------

Co-authored-by: Curtis Robert <crobert@splunk.com>
Co-authored-by: Antoine Toulme <atoulme@splunk.com>
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
Contributor

github-actions bot commented Oct 2, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Oct 2, 2024
@crobert-1 crobert-1 removed the Stale label Oct 2, 2024
2 participants