Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[azure_logs] Add Custom Azure Logs input package #11552

Open
wants to merge 34 commits into
base: main
Choose a base branch
from

Conversation

zmoog
Copy link
Contributor

@zmoog zmoog commented Oct 28, 2024

Proposed commit message

Add a new input package to collect custom logs from Event Hub.

This input package is a thin wrapper on the azure-eventhub input that allows users to receive JSON log events from an event hub and transform them using custom pipelines and mappings.

Users should use this input package instead of generic integration in the Azure Logs integration.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • [ ] I have verified that any added dashboard complies with Kibana's Dashboard good practices

Author's Checklist

  • Add pipeline override
  • Replace the logo with the one used in all input packages
  • Write the docs (overview, requirements, setup)
  • How to deal with routing

Reviewer's Checklist

Please review the following items and, ideally, leave your thoughts in the comments.

  • Routing: I could not find a way to add an OOTB pipeline to route logs to other data streams. Do you know a method?
  • Logo: all the input packages use the same logo. Should we adopt the same logo or use something Azure-specific?
  • Data model: Currently, the input package stores the log events sent by the input without any additional change. The log event contains the JSON document as a string in the message field. This gives users all the options to transform the log event in the custom pipeline. Do you have any suggestions?
  • Ownership: I currently set the package ownership to elastic/obs-ds-hosted-services. Should we set the ownership to elastic/obs-infraobs-integrations?

Related issues

Screenshots

CleanShot 2024-10-30 at 16 37 42@2x

CleanShot 2024-11-06 at 10 40 10@2x

CleanShot 2024-10-30 at 16 39 26@2x

CleanShot 2024-10-30 at 16 42 28@2x

CleanShot 2024-10-30 at 16 44 44@2x

@zmoog zmoog self-assigned this Oct 28, 2024
@zmoog zmoog added enhancement New feature or request Team:obs-ds-hosted-services Label for the Observability Hosted Services team [elastic/obs-ds-hosted-services] Integration:azure_logs Custom Azure Event Hub Input [Integration not found in source] labels Oct 28, 2024
Death by 1000 copy & paste
It seems all other integrations use Elastic-2.0
With this pipeline override, users can run a custom pipeline only
for a specific input package installation.
All other input packages seem to use this logo.
It seems we cannot build OOTB routing into an input package
I don't think we can (or want) offer this option.
@zmoog zmoog marked this pull request as ready for review October 30, 2024 15:52
@zmoog zmoog requested review from a team October 30, 2024 15:57
@zmoog
Copy link
Contributor Author

zmoog commented Oct 30, 2024

cc @lalit-satapathy for the ownership topic in the reviewer's checklist

packages/azure_logs/manifest.yml Show resolved Hide resolved
packages/azure_logs/manifest.yml Outdated Show resolved Hide resolved
packages/azure_logs/docs/README.md Outdated Show resolved Hide resolved
packages/azure_logs/docs/README.md Outdated Show resolved Hide resolved
zmoog and others added 5 commits October 30, 2024 23:40
Co-authored-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
Co-authored-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
- it's consistent with the equivalent AWS integration
- end users a probably not aware of the input vs. integration
zmoog and others added 3 commits November 5, 2024 17:29
Co-authored-by: muthu-mps <101238137+muthu-mps@users.noreply.github.com>
Co-authored-by: muthu-mps <101238137+muthu-mps@users.noreply.github.com>
It seems we always use the name to refer to the service name, so
capitalizing it seems the right choice.
@zmoog
Copy link
Contributor Author

zmoog commented Nov 5, 2024

Questions,

  • How do we make a shift from the Event Hub input integration?
  • Are we going to keep both or deprecate the Azure EventHub input integration?

My current proposal is:

  1. Ship this input package.
  2. Publish a blog post that explains how the input package works and why it's better than the Azure EventHub input integration (the generic integration in Azure Logs).
  3. Deprecate the Azure EventHub input integration in Azure Logs, linking to the input package and blog post.

Then, move to Azure Logs and update it to use only one input.

WDYT?

@zmoog
Copy link
Contributor Author

zmoog commented Nov 6, 2024

There is a full round of updates, and the PR is ready for another round of review.

  • @muthu-mps, @Kavindu-Dodan, please take another look at the latest changes and see if they address your comments! 🙇
  • @alaudazzi, could you take a look at the integration docs in the README file? 🙇

@muthu-mps
Copy link
Contributor

Questions,

  • How do we make a shift from the Event Hub input integration?
  • Are we going to keep both or deprecate the Azure EventHub input integration?

My current proposal is:

  1. Ship this input package.
  2. Publish a blog post that explains how the input package works and why it's better than the Azure EventHub input integration (the generic integration in Azure Logs).
  3. Deprecate the Azure EventHub input integration in Azure Logs, linking to the input package and blog post.

Then, move to Azure Logs and update it to use only one input.

WDYT?

One more point to add here, The Azure Eventhub input can get deprecated once the input package is moved to GA.

- observability
conditions:
kibana:
version: "^8.13.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is a new package, should we use the latest version of the stack?

Copy link
Contributor Author

@zmoog zmoog Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are not offering input v2 support in this version, I thought we could:

  • support 8.13+ users during the technical preview/beta phase (0.x)
  • ship a GA version (1.0) shortly after
  • Then, ship an update (1.1) requiring 8.15+ with input v2 support.

WDYT?

@zmoog
Copy link
Contributor Author

zmoog commented Nov 7, 2024

Questions,

  • How do we make a shift from the Event Hub input integration?
  • Are we going to keep both or deprecate the Azure EventHub input integration?

My current proposal is:

  1. Ship this input package.
  2. Publish a blog post that explains how the input package works and why it's better than the Azure EventHub input integration (the generic integration in Azure Logs).
  3. Deprecate the Azure EventHub input integration in Azure Logs, linking to the input package and blog post.

Then, move to Azure Logs and update it to use only one input.
WDYT?

One more point to add here, The Azure Eventhub input can get deprecated once the input package is moved to GA.

Yep, I agree.

@muthu-mps
Copy link
Contributor

@zmoog - Do you think we need to use the build.yml file to add ECS reference? Not sure if this is not the case for the newer integrations after @ecsmappings changes.

Copy link
Contributor

@alaudazzi alaudazzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some editing suggestions, otherwise the text LGTM.


The integration sets up a dedicated index template named `logs-mydataset` with the `logs-mydataset-*` index pattern. You can then customize it using a custom pipeline and custom mappings.

Custom Logs integrations give you all the flexibility you need to configure the integration to your needs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Custom Logs a name for general integrations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in this case, I am referring to integration or input packages designed to collect custom logs. With "custom logs", I mean any log format (usually JSON document) that may come from any source.

Maybe I should focus on the Custom Azure Logs integration only and change this phrase into:

The Custom Azure Logs integration gives you all the flexibility you need to collect, parse, and map log events to your needs.

WDYT?

packages/azure_logs/docs/README.md Outdated Show resolved Hide resolved
packages/azure_logs/docs/README.md Show resolved Hide resolved
packages/azure_logs/docs/README.md Show resolved Hide resolved
packages/azure_logs/docs/README.md Show resolved Hide resolved
packages/azure_logs/docs/README.md Outdated Show resolved Hide resolved
packages/azure_logs/docs/README.md Outdated Show resolved Hide resolved
packages/azure_logs/docs/README.md Outdated Show resolved Hide resolved
packages/azure_logs/manifest.yml Outdated Show resolved Hide resolved
packages/azure_logs/manifest.yml Outdated Show resolved Hide resolved
@alaudazzi
Copy link
Contributor

alaudazzi commented Nov 7, 2024

@zmoog
A more general comment about capitalization regarding these terms:

  • Consumer group
  • Agents
  • event hub

Are they capitalized when they refer to something specific and not capitalized when referring to something generic? You might want to double-check when capitalization applies.

zmoog and others added 7 commits November 7, 2024 12:01
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
Co-authored-by: Arianna Laudazzi <46651782+alaudazzi@users.noreply.github.com>
@zmoog
Copy link
Contributor Author

zmoog commented Nov 7, 2024

A more general comment about capitalization regarding these terms:

  • Consumer group
  • Agents
  • event hub

Are they capitalized when they refer to something specific and not capitalized when referring to something generic? You might want to double-check when capitalization applies.

I'll review the doc to see if I am not applying the rules consistently, but here are the rules I'm using now (please highlight any wrong usage!)

Consumer group: It's a generic term, so I capitalize "Consumer" only if required by the context (start of a sentence or heading).

Agents: I capitalize it when it means "Elastic Agent" and I leave it lowercase when it's the generic term. Should I use the full name ("Elastic Agent(s)") to avoid ambiguity?

event hub: see the comment #11552 (comment)

@zmoog
Copy link
Contributor Author

zmoog commented Nov 7, 2024

@zmoog - Do you think we need to use the build.yml file to add ECS reference? Not sure if this is not the case for the newer integrations after @ecsmappings changes.

That's a good point. I missed the build.yml file. I'm double-checking, but I should probably include it in this integration to state which version of ECS the integration fields should be tested with.

Copy link

Quality Gate failed Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube

@elasticmachine
Copy link

💚 Build Succeeded

History

cc @zmoog

Copy link
Contributor

@muthu-mps muthu-mps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

The top level fields like resource_id can be parsed from the message field. This is expected to be available in all the events. This can be taken in next release.

@zmoog
Copy link
Contributor Author

zmoog commented Nov 7, 2024

The top level fields like resource_id can be parsed from the message field. This is expected to be available in all the events. This can be taken in next release.

I didn't want to make any assumptions about the message structure. From public issues, I learned that users use the generic integration to ingest non-Azure log payloads like IoT data or third-party application logs (for example, MuleSoft and ConfluentCloud logs).

But we can, of course, revisit this and add standard Azure field mappings.

@zmoog
Copy link
Contributor Author

zmoog commented Nov 7, 2024

@muthu-mps, do you confirm your team elastic/obs-infraobs-integrations is okay with taking ownership of this input?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Integration:azure_logs Custom Azure Event Hub Input [Integration not found in source] New Integration Team:obs-ds-hosted-services Label for the Observability Hosted Services team [elastic/obs-ds-hosted-services]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants