Opened on Aug 20, 2024
Abstract
Provide a brief summary of the RFC's purpose.
The goal of this RFC is to standardize the name and content of the `azure-eventhub` field.
Introduction
Explain the background, context, and motivation for the proposal.
Since its inception five years ago, the `azure-eventhub` input has stored the "event hub metadata" (event hub name, consumer group, offset, and more with input v2) in the `azure` field of type `object`.
However, since many integrations use the `azure` field as the root element for their specific fields (e.g., `azure.activitylogs`), these integrations usually rename the `azure` field holding the metadata to `azure-eventhub` to keep the metadata alongside the actual data.
Here is an example:
{
  "azure-eventhub": {
    "sequence_number": 21916518,
    "partition_id": "1",
    "consumer_group": "$Default",
    "offset": 9955743838336,
    "eventhub": "mbranca815",
    "enqueued_time": "2024-08-20T09:10:01.486Z"
  }
}
Here are a few integrations that rename the `azure` field with metadata to `azure-eventhub`:
- activitylogs
- auditlogs
- eventhub (generic integration)
- graphactivitylogs
- identity_protection
- platformlogs
- provisioning
- signinlogs
- springcloudlogs
- azure_app_service
And others that do not rename the field:
- application_gateway
- firewall_logs
- azure_functions
- azure_frontdoor
- azure_openai
The older integrations perform the rename `azure` > `azure-eventhub`, but the more recent integrations do not.
There are at least two practical problems here:
- The input stores the metadata in a field that most integrations rename as the first step in the default pipeline.
- None of the recent integrations rename the field, creating inconsistencies and potential conflicts.
Proposal
Detail the proposed changes, including technical specifications, diagrams, and examples if necessary.
I suggest:
- Adopting the current de facto standard name `azure-eventhub` as the official metadata field name.
- Documenting all the existing field content.
- Changing the input to store the metadata in the `azure-eventhub` field.
- Changing the input to make the `azure-eventhub` field optional to save storage, if required (default enabled).
- Making sure all existing integrations work with the `azure-eventhub` field.
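The input-side part of the proposal could be sketched roughly as follows. This is a minimal Python illustration, not the input's actual code; the helper name `build_event` and the `include_metadata` flag are hypothetical and only model "store the metadata under `azure-eventhub`, enabled by default, optionally disabled".

```python
from typing import Any, Dict

METADATA_FIELD = "azure-eventhub"  # proposed official metadata field name


def build_event(
    message: str,
    metadata: Dict[str, Any],
    include_metadata: bool = True,  # hypothetical option; default enabled
) -> Dict[str, Any]:
    """Build the event the input would publish (hypothetical helper)."""
    event: Dict[str, Any] = {"message": message}
    if include_metadata:
        # Store the metadata under `azure-eventhub` rather than `azure`,
        # leaving `azure` free for integration-specific fields.
        event[METADATA_FIELD] = dict(metadata)
    return event
```

With the flag disabled, the event carries only the message, which is where the storage saving would come from.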
Existing field content
The metadata field contains the following information.
Field | Description | Notes |
---|---|---|
`azure-eventhub.eventhub` | Event hub name | |
`azure-eventhub.consumer_group` | Name of the consumer group | |
`azure-eventhub.enqueued_time` | Timestamp of the time the message was published on the event hub | |
`azure-eventhub.offset` | Message offset in the event hub partition | |
`azure-eventhub.sequence_number` | Message sequence number in the event hub partition | |
`azure-eventhub.partition_id` | The partition ID of the message | since v2 |
`azure-eventhub.partition_key` | The partition key of the message | since v2 (optional) |
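For documentation purposes, the field content can also be written down as a typed structure. The sketch below is illustrative only: the class name `EventHubMetadata` and the method `to_fields` are hypothetical, and the Python types are assumptions inferred from the sample document in the introduction (for example, `partition_id` appears there as a string and `offset` as a number).

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class EventHubMetadata:
    """Content of the `azure-eventhub` field (types are assumptions)."""

    eventhub: str                        # event hub name
    consumer_group: str                  # name of the consumer group
    enqueued_time: str                   # publish timestamp, ISO 8601 string
    offset: int                          # message offset in the partition
    sequence_number: int                 # message sequence number in the partition
    partition_id: Optional[str] = None   # since v2
    partition_key: Optional[str] = None  # since v2 (optional)

    def to_fields(self) -> Dict[str, Any]:
        """Return only the fields that are set, ready to nest under `azure-eventhub`."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Leaving the v2-only fields optional lets the same structure describe events produced by both input versions.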
Rationale
Justify the proposal by discussing the problem it solves and why this solution is chosen over alternatives.
Name
- It is used for the majority of integrations.
- It is backward compatible.
- Since it's the same name as the input, the likelihood of conflicts is probably low.
If I could go back in time to when the input was created, with today's experience I would call this field something like `azure_eventhub_metadata`. However, `azure-eventhub` is good enough to represent the semantics.
Changing the field name would cause a breaking change that doesn't feel worth it, given the secondary role of the metadata field from the users' perspective.
Impact
Describe the expected impact on users, systems, and any potential side effects.
Since all integrations will use the `azure-eventhub` field, we expect a reduction in mapping conflicts from the `azure` field.
Security Considerations
Address any security implications of the proposal.
No security implications so far.
Backward Compatibility
Explain any effects on existing systems or versions.
We need to double-check whether the rename processor in the existing integrations works correctly when there is no `azure` field in the message.
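The behavior to verify can be modeled with a small sketch. The function `rename_field` below is a hypothetical, simplified stand-in for a rename step on a top-level field, loosely following the Elasticsearch rename processor's documented semantics (it fails when the source field is missing unless `ignore_missing` is set, and fails when the target already exists). It is not the code of any existing integration pipeline.

```python
from typing import Any, Dict


def rename_field(
    doc: Dict[str, Any],
    source: str,
    target: str,
    ignore_missing: bool = False,
) -> Dict[str, Any]:
    """Simplified model of a rename step on a top-level field."""
    if source not in doc:
        if ignore_missing:
            return doc  # nothing to rename; leave the document untouched
        raise KeyError(f"field [{source}] doesn't exist")
    if target in doc:
        raise ValueError(f"field [{target}] already exists")
    doc[target] = doc.pop(source)
    return doc
```

Under this model, an event that already carries `azure-eventhub` and has no `azure` field passes through unchanged only when the rename step tolerates a missing source, so each existing pipeline would need to be checked for that setting.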
Implementation
Outline the steps needed for implementation, including timelines, milestones, and responsible parties.
Conclusion
Summarize the key points and restate the importance of the proposal.
Key Points Summary
- Proposal Purpose: Standardize the `azure-eventhub` field name and content across integrations for consistency.
- Background: Historical inconsistencies arose as the `azure` field was renamed to `azure-eventhub` in various implementations, causing confusion.
- Current Issues: Varied naming has led to difficulties in field mappings and increased conflict risks among older and newer integrations.
- Proposed Changes: Adoption of `azure-eventhub` as the official field name, documentation of existing field content, making the field optional, ensuring backward compatibility.
- Expected Impact: Reducing mapping conflicts and enhancing harmony across diverse integrations through standardization.
- Implementation Steps: Clear plan for execution, including updates to input settings, adding rename processors, and documenting existing metadata.
Importance of the Proposal
- Ensures consistency and clarity in handling Azure Event Hub metadata across integrations.
- Addresses ongoing conflicts, improving ease of integration across the ecosystem.
References
List any external references or documents cited in the RFC.