Support Elastic Common Schema in OpenTelemetry #199

Closed

Member

@cyrille-leclerc cyrille-leclerc commented Mar 15, 2022

This OTEP proposes to add support for Elastic Common Schema in OpenTelemetry, enriching the Otel Semantic Conventions.
It has been prepared by @cyrille-leclerc, @alolita, @kumoroku, @jkowall, @danielkhan, and many others.

Relates to:

@cyrille-leclerc cyrille-leclerc requested a review from a team March 15, 2022 12:43
Member

@yurishkuro yurishkuro left a comment

It is not clear to me what this OTEP is actually proposing. It says "to support ECS", but what does it mean in practice? Is it proposing that OTLP data model be changed to match ECS? Is it proposing a mapping of OTEL semantic conventions to/from ECS?

@cyrille-leclerc
Member Author

> It is not clear to me what this OTEP is actually proposing. It says "to support ECS", but what does it mean in practice? Is it proposing that OTLP data model be changed to match ECS? Is it proposing a mapping of OTEL semantic conventions to/from ECS?

@yurishkuro thanks for identifying this lack of clarity. The proposal is to enrich existing OpenTelemetry Semantic Conventions with the fields defined in the Elastic Common Schema.
We are aware that some overlaps will have to be resolved. We would favor backward compatibility of the OpenTelemetry Semantic Conventions, usually keeping the existing OTel Semantic Convention attribute unless there is a good reason to prefer the ECS field name.

Does this clarify the proposal?

@yurishkuro
Member

This clarifies the intent, but not the proposal. The OTEP itself does not provide said mapping, so if I were to approve the OTEP, what am I approving, the intent to create another OTEP that will contain the actual mapping?

@cyrille-leclerc
Member Author

cyrille-leclerc commented Mar 15, 2022

> This clarifies the intent, but not the proposal. The OTEP itself does not provide said mapping, so if I were to approve the OTEP, what am I approving, the intent to create another OTEP that will contain the actual mapping?

You are correct, @yurishkuro. If this OTEP is accepted, then we plan to:

  • Define the mechanism to integrate ECS fields into the OTel Semantic Conventions (there are 40+ namespaces in ECS), and then
  • Integrate the ECS fields according to that mechanism

Does that make sense?

@yurishkuro
Member

Ok. I don't have objection to such sequencing, but please make it explicit in the OTEP that this is the action plan.

On a related note, what's preventing you from doing the mapping in the same OTEP? Are you concerned that it may not be accepted after doing all that work?

And one more thing: it would be good to add a diagram (eg using mermaid) showing the proposed data flow, i.e. what is the direction of transformations, where the data comes from and goes into.

@cyrille-leclerc
Member Author

> Ok. I don't have objection to such sequencing, but please make it explicit in the OTEP that this is the action plan.

Thanks, I'll do it asap.

> On a related note, what's preventing you from doing the mapping in the same OTEP? Are you concerned that it may not be accepted after doing all that work?

Correct, it's the quantity of work.

> And one more thing: it would be good to add a diagram (eg using mermaid) showing the proposed data flow, i.e. what is the direction of transformations, where the data comes from and goes into.

I'm not sure I follow. We intended to enrich the OTel Semantic Conventions with new attributes, and we did not identify any changes to the data flow.
We envisioned, for example, authors of OTel Collector receivers using these new attributes to give more structure to the log files they parse, but people would still create OTel Collector receivers to parse log files.
Is this something we should clarify in the OTEP?

@yurishkuro
Member

> We intended to enrich the OTel Semantic Conventions with new attributes, and we did not identify any changes to the data flow.
> We envisioned, for example, authors of OTel Collector receivers using these new attributes to give more structure to the log files they parse, but people would still create OTel Collector receivers to parse log files.
> Is this something we should clarify in the OTEP?

Most semantic conventions are used to capture data via instrumentation; some are for converting from existing formats. I am not familiar enough with ECS: is it actually an established interchange format, or just a format for data at rest in ES? Would you have log files written with ECS-formatted data?

Basically, my point about the diagram: if you're defining the mapping, then paint a picture where that mapping might be used.

@jkowall

jkowall commented Mar 16, 2022

@yurishkuro the purpose is to be able to correlate data from different log sources which are not OpenTelemetry instrumented data sources. Think of existing technologies like routers, switches, host operating systems, DNS logs, app server logs, and so forth. Hopefully, over time some of these will switch to an OTel format with the proper schema, but as you are aware, this can take decades to change. Meanwhile, we need to correlate data between these logs, and that's why this is important for users to have.

@cyrille-leclerc
Member Author

cyrille-leclerc commented Mar 16, 2022

@yurishkuro FYI, I clarified:

  • The process we propose to merge ECS fields into the OpenTelemetry Semantic Conventions here
  • How OpenTelemetry users would practically use the new OpenTelemetry Semantic Conventions attributes brought by ECS here. I'll iterate on this example to provide a diagram.

cyrille-leclerc and others added 2 commits March 31, 2022 16:32
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
@weyert

weyert commented Jun 14, 2022

It's not fully clear what the benefit of this is for a standard OpenTelemetry consumer who is not using Elastic. Is the ECS format commonly supported by other logging vendors? I can't really find many references to it.

For example, how does this help me when I am using Cloud Logging or Loggly?

@jkowall

jkowall commented Jun 15, 2022

> It's not fully clear what the benefit of this is for a standard OpenTelemetry consumer who is not using Elastic. Is the ECS format commonly supported by other logging vendors? I can't really find many references to it.
>
> For example, how does this help me when I am using Cloud Logging or Loggly?

Not every logging system has parsing and schemas, but the good ones typically do. Even if it does, it makes sense to normalize the data before you send it to the system. If your logging system doesn't support schemas, you MUST map the data before you store it.

The reason is so that you can correlate the data from various sources. For example, say I am capturing logs from a Palo Alto Firewall, which calls the source IP one thing, and I'm also capturing ipfw logs from a Linux host, which calls the source IP something different in the log data. How do I query these consistently?

If you are only capturing logs from custom software using Otel Logging then you will not have this issue, but unfortunately, we get logs from many sources, and we cannot easily correlate the data.
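
To make the correlation problem above concrete, here is a minimal Python sketch. It assumes two log sources that name the source IP differently and renames both onto the ECS field `source.ip`. The per-source field names (`src`, `from_addr`) and the helper names are hypothetical placeholders, not the real formats of either product.

```python
# Hypothetical per-source names for the source IP field.
# Real firewall and ipfw log formats will differ.
SOURCE_IP_ALIASES = {
    "paloalto": "src",
    "ipfw": "from_addr",
}


def normalize(source: str, record: dict) -> dict:
    """Return a copy of record with the source-specific IP field renamed to source.ip."""
    alias = SOURCE_IP_ALIASES[source]
    out = {key: value for key, value in record.items() if key != alias}
    out["source.ip"] = record[alias]
    return out


def from_ip(records: list[dict], ip: str) -> list[dict]:
    """One query that works across all sources once the field name is shared."""
    return [r for r in records if r.get("source.ip") == ip]


records = [
    normalize("paloalto", {"src": "10.0.0.5", "action": "deny"}),
    normalize("ipfw", {"from_addr": "10.0.0.5", "rule": 100}),
]
matches = from_ip(records, "10.0.0.5")
```

Without the shared name, `from_ip` would need to know every source's spelling of the field; with it, the query is written once.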

@Mpdreamz

@jkowall @arminru and others on this thread 👋, I'm supporting @cyrille-leclerc's efforts from Elastic's side.

Is there anything still blocking merging this PR? Very keen to hear if there are still open ends that need clarification on our end!

@jkowall

jkowall commented Aug 24, 2022

I don't think so; we just need reviews from others to permit merging. I still regularly see the need for this in user discussions.

@Mpdreamz

Thanks @jkowall 👍! As discussed with @tigrannajaryan today during the Logs SIG meeting, the next step for us would be to open it up to the wider group in the Specification SIG to get a stronger consensus around the intent behind this OTEP.

That would open us up to focus on the mechanics moving forward too.

Member

@reyang reyang left a comment

I'm very supportive of this effort.
There are some fundamental issues that we need to address as part of open-telemetry/opentelemetry-specification#2753. Aligning ECS and the OpenTelemetry semantic conventions is the right direction (it will benefit the industry and both ecosystems), IMHO.

@reyang
Member

reyang commented Sep 23, 2022

@jsuereth I want to get your opinion here.

Specifically, I wonder if we should continue to take new semantic conventions PRs if these are already covered by ECS (e.g. open-telemetry/opentelemetry-specification#2824 process/system uptime seems to be a common thing, if we envision that ECS and OpenTelemetry semantic convention would align AND ECS has already covered it, should we stop this PR and set the expectation, or we continue to let these PRs in, and create more work to smooth them out?)

@astencel-sumo FYI


## Introduction

This proposal is to add support for the Elastic Common Schema (ECS) in the OpenTelemetry specification and provide full interoperability for ECS in OpenTelemetry component implementations. We propose to implement this support by enriching OpenTelemetry Semantic Conventions with ECS fields. The goal is to merge ECS into OTel Semantic Conventions.
Member

> The goal is to merge ECS into OTel Semantic Conventions.

What do you mean by "merge"? Is the ECS project going to be retired once the "merge" is accomplished?

@yurishkuro There has been some confusion around this - we'll be updating the OTEP to clarify the intent. Our goal is to partner with the OTel community to facilitate the adoption of ECS fields within the OTel Semantic Conventions. OTel can decide which fields are appropriate to adopt or exclude. ECS will continue to be maintained by Elastic and heavily used within Kibana. However, we'd be happy for OTel members to engage in our RFC process to ensure ECS and the Semantic Conventions are as aligned as possible going forward.

@ruflin

ruflin commented Sep 26, 2022

@reyang It would be great to have these fields in ECS / single place. It was always the goal to get metrics into ECS especially some of these fundamentals. See elastic/ecs#474 (comment) for more discussions.

We propose 3 steps to add support for ECS in OpenTelemetry Semantic Conventions:

1. Validation of the principle of adding support for ECS in OpenTelemetry and validation that this support would be implemented by merging ECS fields in OpenTelemetry Semantic Conventions,
2. Validation of the methodology to merge these ECS fields. As there are 40+ ECS namespaces, as there will be some overlaps, and as some ECS field names may need to evolve to match the vocabulary and conventions of OTel, we have in mind an iterative process tackling namespaces one after the other. We are also interested in clarifying how downstream schemas could be created; we have, for example, identified the value of having downstream schemas to specify persistence characteristics (see ECS string persistence types <a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/text.html#match-only-text-field-type">match_only_text</a>, <a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/keyword.html#keyword-field-type">keyword</a>, <a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/keyword.html#constant-keyword-field-type">constant_keyword</a>, <a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/keyword.html#wildcard-field-type">wildcard</a>),
Member

What is a "downstream schema"?

Member Author

@cyrille-leclerc cyrille-leclerc Sep 30, 2022

We said "downstream schema" by analogy with "downstream distributions".
The principle is that some vendors will be interested in extending the schema provided by the OpenTelemetry Semantic Conventions for reasons like:

  • Adding metadata to OTel Semantic Attributes to specify the storage strategy (e.g. specifying the indexing strategy when persisting string attributes)
  • Adding Attributes in addition to the existing OTel Semantic Attributes. For example:
    • It's common for large orgs that adopt Elastic Common Schema to extend it with their enterprise fields.
    • Datadog uses the concept of "Standard Attributes" to enable organizations to extend Datadog's built-in attributes with their specific needs.

Should we use a different name or clarify this section?
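
As a rough illustration of the two extension reasons listed above, the following Python sketch builds a "downstream" schema from an "upstream" one by attaching storage hints and adding organization-specific attributes. The schema layout, the `extend` helper, the `storage` key, and the `acme.cost_center` attribute are all assumptions made for this example; the OTEP does not define such a mechanism.

```python
# Hypothetical upstream schema: attribute name -> definition.
UPSTREAM = {
    "host.name": {"type": "string"},
    "log.file.path": {"type": "string"},
}


def extend(upstream: dict, storage_hints: dict, extra_attributes: dict) -> dict:
    """Build a downstream schema without modifying the upstream one."""
    clash = upstream.keys() & extra_attributes.keys()
    if clash:
        raise ValueError(f"downstream may not redefine upstream attributes: {sorted(clash)}")
    # Copy every definition so the upstream schema stays untouched.
    downstream = {name: dict(defn) for name, defn in upstream.items()}
    downstream.update({name: dict(defn) for name, defn in extra_attributes.items()})
    # Attach vendor-specific storage metadata (e.g. an index type) per attribute.
    for name, hint in storage_hints.items():
        downstream[name]["storage"] = hint
    return downstream


downstream = extend(
    UPSTREAM,
    storage_hints={"host.name": "keyword", "log.file.path": "wildcard"},
    extra_attributes={"acme.cost_center": {"type": "string"}},
)
```

The point of the copy-then-extend shape is that the upstream conventions remain the single shared definition, while each vendor or organization layers its own metadata on top.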

@tigrannajaryan
Member

I think there is a great deal of value that we all can derive from this initiative. That said, I think there are a few things that the OTEP needs to address but doesn’t.

  1. It needs to make clear that "merging" does not result in one final set of semantic conventions that both OpenTelemetry and ECS use. "Merging" means adding new conventions to OpenTelemetry by borrowing their definitions from ECS.

  2. As a result of "merging" we will end up with OpenTelemetry semantic conventions that as a whole will still be different from ECS. OpenTelemetry semantic conventions and ECS will share a (possibly large) common subset, but they won’t be exactly the same as a whole.

  3. After the "merging" is complete OpenTelemetry semantic conventions and ECS will not be cemented. They will continue to evolve. The OTEP does not say whether this evolution will be done in any sort of synchronized manner or we should expect that the OpenTelemetry semantic conventions and ECS will gradually drift further apart over time.

  4. The OTEP does not address the topic of co-existence. Will it somehow enable the following scenario: ECS-compatible data sources to send data to OpenTelemetry-compatible backends (and vice versa)? Do we expect an ability for (unambiguous?) runtime transformation from ECS to OpenTelemetry (and vice versa) that can be done for example in the OpenTelemetry Collector processor or elsewhere in the collection pipeline or at query time in the backend? Unsurprisingly, this looks a lot like the problem that Telemetry Schemas solve for different versions of Schemas and solutions may look similar as well. It is unclear if this is considered at all.

Since this is more of a vision OTEP that is expected to be followed by a more specific "how do we do that" proposal I don’t expect this OTEP to necessarily have all detailed answers to these questions, but it needs to at least clarify that these are concerns that need to be addressed (the OTEP touches these tangentially in the very last paragraph but it is not very explicitly articulated).

I also generally feel that the "How would Otel users practically use" section is very sparse and would benefit from being more elaborate.
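
The runtime transformation raised in point 4 could look roughly like the rename table below, similar in spirit to attribute renames in Telemetry Schemas. This is only a sketch: the three ECS/OTel name pairs are illustrative, not a vetted or complete mapping, and the function names are hypothetical. The `invert` helper shows why "unambiguous" matters: the reverse direction only exists if no two source names map to the same target.

```python
# Illustrative ECS -> OTel attribute renames; not a vetted or complete mapping.
ECS_TO_OTEL = {
    "url.full": "http.url",
    "user_agent.original": "http.user_agent",
    "http.request.method": "http.method",
}


def invert(renames: dict) -> dict:
    """Invert a rename table, refusing tables where two keys share a target."""
    inverse = {}
    for src, dst in renames.items():
        if dst in inverse:
            raise ValueError(
                f"ambiguous: {inverse[dst]!r} and {src!r} both map to {dst!r}"
            )
        inverse[dst] = src
    return inverse


def rename_attributes(attrs: dict, renames: dict) -> dict:
    """Rename keys found in the table; pass everything else through unchanged."""
    return {renames.get(key, key): value for key, value in attrs.items()}


OTEL_TO_ECS = invert(ECS_TO_OTEL)

ecs_event = {"url.full": "https://example.com/a", "host.name": "web-1"}
otel_attrs = rename_attributes(ecs_event, ECS_TO_OTEL)
round_trip = rename_attributes(otel_attrs, OTEL_TO_ECS)
```

A pure rename table like this covers only the simplest case. Fields whose types, units, or nesting differ between the two schemas would need more than a key rename, which is part of what a follow-up proposal would have to address.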

@reyang
Member

reyang commented Sep 29, 2022

> 3. After the "merging" is complete OpenTelemetry semantic conventions and ECS will not be cemented. They will continue to evolve. The OTEP does not say whether this evolution will be done in any sort of synchronized manner or we should expect that the OpenTelemetry semantic conventions and ECS will gradually drift further apart over time.

Good point @tigrannajaryan. I think if this ended up with OpenTelemetry semantic conventions and ECS evolving independently after the initial "merge", it is kind of defeating the purpose here.

@linux-foundation-easycla

linux-foundation-easycla bot commented Oct 18, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

@Mpdreamz

@tigrannajaryan thanks for the feedback!

I updated the "Proposed process to contribute ECS to OpenTelemetry Semantic Conventions" section to include more details on the contribution and what coexistence could look like. It now also highlights in stronger terms that this is in fact a contribution and not a merger.

We still feel there is a massive benefit to closing the gap and ensuring each other's success by aligning the two specifications more closely.

@cyrille-leclerc
Member Author

cyrille-leclerc commented Nov 21, 2022

As I am no longer part of Elastic and I am no longer connected to decisions on Elastic Common Schema (ECS), would it be better if I closed this PR and let @jamiehynds, @Mpdreamz, @AlexanderWert and @ruflin create a new PR?

@cyrille-leclerc
Member Author

I'm closing this PR to prevent misunderstandings as I'm no longer working with Elastic. I'll let @jamiehynds, @Mpdreamz, @AlexanderWert and @ruflin progress on this topic the way they want.

@AlexanderWert
Member

A new PR has been created for this proposal in #222.
We would appreciate it if the discussion continued there and the approvals from this PR were "transferred" to #222.
