Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/elasticsearch] Add mapping.mode: raw configuration option #29619

Merged

Conversation

ycombinator
Copy link
Contributor

@ycombinator ycombinator commented Dec 1, 2023

Description:

This PR adds a new configuration option, mapping.mode: raw, to the Elasticsearch exporter. When set, the Elasticsearch exporter will not prefix log or span attributes with Attributes. when forming the Elasticsearch document field names for these fields. Additionally, the exporter will also not prefix span events with Events.* with forming the Elasticsearch document field names for these fields.

Link to tracking Issue: Resolves #26647

Testing:

Besides adding/updating relevant unit tests in this PR, I also tested the changes in this PR against a local Elasticsearch cluster, using the following collector configurations:

  1. Without the new mapping.mode: raw setting.

    receivers:
      tcplog:
        listen_address: "0.0.0.0:54545"
    
    processors:
      attributes:
        actions:
          - action: insert
            key: first_attribute
            value: one
          - action: insert
            key: second_attribute
            value: two
    
    exporters:
      debug:
        verbosity: detailed
      elasticsearch:
        endpoints: [ "https://localhost:9200" ]
        user: elastic
        password: XXXXXXXX
        logs_index: otel-logs
        tls:
          insecure_skip_verify: true
        flush:
          interval: 1s
    
    service:
      pipelines:
        logs:
          receivers: [tcplog]
          processors: [attributes]
          exporters: [debug,elasticsearch]

    Resulting document in Elasticsearch:

    {
      "_index": "otel-logs",
      "_id": "l1E5J4wBD9bb2EmZJuDR",
      "_score": 1,
      "_source": {
        "@timestamp": "1970-01-01T00:00:00.000000000Z",
        "Attributes": {
          "first_attribute": "one",
          "second_attribute": "two"
        },
        "Body": "bar",
        "Scope": {
          "name": "",
          "version": ""
        },
        "SeverityNumber": 0,
        "TraceFlags": 0
      }
    }
  2. With the new mapping.mode: raw setting.

    receivers:
      tcplog:
        listen_address: "0.0.0.0:54545"
    
    processors:
      attributes:
        actions:
          - action: insert
            key: first_attribute
            value: one
          - action: insert
            key: second_attribute
            value: two
    
    exporters:
      debug:
        verbosity: detailed
      elasticsearch:
        endpoints: [ "https://localhost:9200" ]
        user: elastic
        password: XXXXXXXX
        logs_index: otel-logs
        tls:
          insecure_skip_verify: true
        flush:
          interval: 1s
        mapping:
          mode: raw
    
    service:
      pipelines:
        logs:
          receivers: [tcplog]
          processors: [attributes]
          exporters: [debug,elasticsearch]

    Resulting document in Elasticsearch:

    {
      "_index": "otel-logs",
      "_id": "jlE4J4wBD9bb2EmZp-Cd",
      "_score": 1,
      "_source": {
        "@timestamp": "1970-01-01T00:00:00.000000000Z",
        "Body": "foo bar baz",
        "Scope": {
          "name": "",
          "version": ""
        },
        "SeverityNumber": 0,
        "TraceFlags": 0,
        "first_attribute": "one",
        "second_attribute": "two"
      }
    }      

Documentation: Documented the new configuration option in the Elasticsearch exporter's README.md.

@ycombinator ycombinator requested review from a team and andrzej-stencel December 1, 2023 22:33
Copy link

linux-foundation-easycla bot commented Dec 1, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Member

@JaredTan95 JaredTan95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator Thanks for your contributions. But we have event in traces(

document.AddEvents("Events", span.Events())
), I think we should also remove the event prefix if this option set to true.

Maybe we can add raw mode in mappging.mode as an optional, becauese the manner of removed prefix does not match to the ecs mode.

@ycombinator
Copy link
Contributor Author

Thanks for the feedback, @JaredTan95. I'll rework this PR to:

  • replace the mapping.omit_attributes_prefix: true setting with the mappging.mode: raw setting, and
  • omit the Attributes. and Events. prefixes both in Elasticsearch document field names if mappging.mode: raw is set.

@ycombinator ycombinator changed the title [exporter/elasticsearch] Add mapping.omit_attributes_prefix configuration option [exporter/elasticsearch] Add mapping.mode: raw configuration option Dec 5, 2023
@ycombinator
Copy link
Contributor Author

@JaredTan95 I've updated this PR based on your feedback. Please re-review when you get a chance. Thanks!

@@ -40,3 +40,5 @@ elasticsearch/log:
max_requests: 5
sending_queue:
enabled: true
mapping:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we added raw mode, we dropped the attributes and events prefixes by default? So this configuration can be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch, thanks.

Copy link
Contributor Author

@ycombinator ycombinator Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4a4a942895bbe69a594772bcf97a048280a982ee.

@ycombinator ycombinator force-pushed the exp-es-omit-attributes-prefix branch 2 times, most recently from 4a4a942 to 7678636 Compare December 13, 2023 20:47
@ycombinator
Copy link
Contributor Author

@JaredTan95 This PR is ready for another round of review. Thanks!

@@ -184,7 +184,7 @@ func TestLoadConfig(t *testing.T) {
MaxInterval: 1 * time.Minute,
},
Mapping: MappingsSettings{
Mode: "ecs",
Mode: "raw",
Copy link
Member

@JaredTan95 JaredTan95 Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding another raw case instead of change ecs? others LGTM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in bcdd9b5a36f4276c18b010516e17f3d73deae511.

Copy link
Member

@JaredTan95 JaredTan95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contributions~

Copy link
Contributor

github-actions bot commented Jan 2, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 2, 2024
@@ -68,6 +68,9 @@ This exporter supports sending OpenTelemetry logs to [Elasticsearch](https://www
- `ecs`: Try to map fields defined in the
[OpenTelemetry Semantic Conventions](https://github.com/open-telemetry/semantic-conventions)
to [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/current/index.html).
- `raw`: Omit the `Attributes.` string prefixed to field names for log and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for a user to want both "ecs" and "raw" functionality? With this configuration, this is not possible.

Looking at the code, I'm not sure the "ecs" mapping mode actually does anything? Is there a functional difference between "none" and "ecs" mapping mode in the current state of the exporter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the code, I'm not sure the "ecs" mapping mode actually does anything? Is there a functional difference between "none" and "ecs" mapping mode in the current state of the exporter?

Yeah, I don't see mapping mode being used at all (before this PR introduced raw). So I don't think there's any functional difference between none and ecs mapping modes today.

I looked at the PR that introduced the ecs mapping mode — #2324 — but I don't see any explanation for why it is functionally the same as raw. My guess is that ecs was introduced as part of some future-proofing work that was never done?

Given that ecs mapping mode doesn't really do anything today, I can't say that users may want both ecs and raw functionality at the same time. Perhaps we should deprecate ecs (in a separate PR) and later remove it, and then go ahead with this PR introducing raw mapping mode? If so, are there any guidelines around such deprecation and removal (I'm new to the OTel community)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. If the ecs mode does not do anything, then there's no point in enabling both ecs and raw at the same time.

Perhaps we should deprecate ecs (in a separate PR) and later remove it

Agree 💯

and then go ahead with this PR introducing raw mapping mode

I think both changes can be done independently, can't they? We can move on with introducing raw mode in this PR and separately create another PR to deprecate ecs. To put it yet another way, ecs removal does not block introduction of raw in my view.

As to the deprecation and removal process, the ecs mode is currently the default, so I think the following steps make sense:

  1. Deprecate the ecs mode and at the same time change the default from ecs to none - this is a no-op change assuming both modes are functionally the same.
  2. Remove the ecs mode after a long time - perhaps 3 months, perhaps 6 months - I suppose there's no rush here, because having this option doesn't really hurt us, right?

I don't think introducing a feature gate for this removal makes sense, given that the fixing the configuration - removing mode: ecs or changing it to mode: none - is simpler than using a feature gate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @astencel-sumo, I've created #30441 to track the deprecation of mode: ecs and changing the default to mode: none.

I agree we can just go ahead with this PR here introducing mode: raw now itself, independently of what becomes of mode: ecs.

@github-actions github-actions bot removed the Stale label Jan 3, 2024
Copy link
Member

@andrzej-stencel andrzej-stencel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@andrzej-stencel andrzej-stencel added the ready to merge Code review completed; ready to merge by maintainers label Jan 18, 2024
@ycombinator
Copy link
Contributor Author

Hi @JaredTan95, this PR is ready for merging. I don't have rights to do that; could you merge it please? Thanks!

Copy link
Contributor

github-actions bot commented Feb 6, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 6, 2024
@andrzej-stencel
Copy link
Member

It is not stale, it is ready to merge. Can we get this merged? @open-telemetry/collector-contrib-maintainer

@MovieStoreGuy MovieStoreGuy merged commit aa36c85 into open-telemetry:main Feb 6, 2024
94 checks passed
@github-actions github-actions bot added this to the next release milestone Feb 6, 2024
@ycombinator ycombinator deleted the exp-es-omit-attributes-prefix branch February 6, 2024 22:41
mx-psi pushed a commit that referenced this pull request Apr 5, 2024
**Description:** 

This PR proposes adding @ycombinator as a codeowner for the
`elasticsearch` exporter component, being an [employee of
Elastic](https://www.linkedin.com/company/elastic-co/people/?keywords=shaunak)
and also meeting the codeowner
[requirements](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#requirements):

1. [Be a member of the OpenTelemetry
organization.](https://github.com/open-telemetry/community/blob/main/community-membership.md#member)
   * https://github.com/orgs/open-telemetry/people?query=ycombinator
2. (Code Owner Discretion) It is best to have resolved an issue related
to the component, contributed directly to the component, and/or review
component PRs. How much interaction with the component is required
before becoming a Code Owner is up to any existing Code Owners.
* Resolved
#26647
via
#29619
* Reviewed
#31553
* Contributed
#31694
as follow up to
#31553
* Reviewed
#31848
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exporter/elasticsearch ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Sending raw logs via Elasticsearch exporter when format option is set
5 participants