[exporter/googlecloud] Exporter for Logs should follow Data Model #16495
Comments
Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Thanks for the detailed issue description @schmikei! I think the main friction here is that the log receivers/transform processors parse to That was an intentional decision (open-telemetry/opentelemetry-log-collection#431) in order to make the operators non-destructive to the original log. Along those same lines, we wanted the exporter to be as non-opinionated as possible, leaning on the availability of other processors to allow users to modify entries as they need to. Could you clarify how this doesn't follow the data model? It wasn't clear to me from the issue description, but based on the logging data model spec, attributes should be matched to GCP labels. Based on that, I think this feature would actually be a change from the current spec.
This was indeed something we overlooked, and fixed in GoogleCloudPlatform/opentelemetry-operations-go#531, which will be in the next release :) |
It's worth noting that parsing to attributes is strongly encouraged by the data model itself, so there is more to it than just preserving the original body. This is actually noted in the linked issue: The incongruity seems to be that, while the json_payload is intended to represent structured data, the data model calls for the body to be unstructured. Therefore, applications and configurations that follow the data model's suggested usage of body will never make use of json_payload, even though they may be representing structured logs. |
Talked about this offline with @braydonk and other gcp folks, we'd like to bring this up at the next SIG meetings (Logging to ask about the data model semantics and Collector to clarify how a change like this would affect the stability/compatibility of our exporter). Personally I think adding a feature flag to our exporter like But, if this is something we just want to address in the data model (ie, no one actually cares about preserving the original string log), then we should just make it the default behavior and avoid the extra hoop of a feature flag. |
(linking) spec update for structuredbody: open-telemetry/opentelemetry-specification#3014 |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself. |
@schmikei Thanks for the detailed explanation on this issue. I am currently trying to navigate how to align logs coming from the collector with Google Cloud Logging as well. What's the resolution on this? |
@psoladoye, open-telemetry/opentelemetry-specification#3023 came out of the discussions around this, which, if I remember correctly, should have resolved this issue. Perhaps @damemi can check me on that, and if it's agreed we could close this. |
@djaglowski Got it. Thanks. |
My understanding of where this issue left off: Google Cloud Exporter maps This was hashed out in a spec meeting where it was clarified that if the source is a structured log already, i.e. a third party structured log, then parsing structured data into the body is perfectly fine. The original intent of "messages in This was addressed in the spec in open-telemetry/opentelemetry-specification#3023 The result of this is that how the Google Cloud Exporter handles this actually makes sense based on the intention of the spec, and that data coming from logging sources should parse structured data into This is where I lost the thread of the issue; I don't remember what the plan was on the logging receiver/processor side. However, since this issue is targeted at the Google Cloud Exporter, and we have determined the exporter behaviour here makes sense with the spec, I am tempted to say this issue can be closed. |
Component(s)
exporter/googlecloud
Describe the issue you're reporting
The Main Issue
Because the Google Cloud Exporter treats the Body of a log message differently than other exporters do, receivers and users have to be cognizant of which exporter their logs are going to, and the same configuration cannot be applied universally to all telemetry pipelines.
The main issues that are strongly apparent are:
jsonPayload field and only will ever use textPayload
Current Behavior of the Exporter
Currently the Google Cloud exporter maps Body to GCL textPayload or jsonPayload, and Attributes is mapped to labels. From a UX perspective, this is undesirable, because the behavior of first-party receivers is to keep the raw log on Body while parsing to attributes. This often leads to textPayload being used where a customer expects their parsed log parts to be mapped to a jsonPayload. This is especially undesirable when the parsed attributes contain nested fields (e.g. parsing a JSON payload which may be arbitrarily nested).
Data model
The current data model strongly recommends that the body of an OTel log be a String for first-party receivers. This is generally how the log receivers in contrib are built, with parsed info for a log line going to the attributes. For instance, this is the default behavior of the pkg/stanza based parsing operators. In the Google Cloud exporter, these are currently mapped to labels.
However, when collecting and sending logs using other agents, e.g. the Ops Agent, these parsed attributes are normally mapped to jsonPayload.
Proposed solution
Ultimately the maintainers should propose a solution they feel good about; however, here are some alternatives that we have considered.
Instead of mapping Attributes to labels, combine Body and Attributes together, and use the combined map for jsonPayload.
One more thing we can do to better fit Google's LogEntry model may be to add Resource Attributes to the labels result.
Examples
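As a rough illustration of the difference between the current mapping and the proposed one, here is a minimal Python sketch. It is not the exporter's code (the exporter is written in Go), and everything beyond the LogEntry field names textPayload, jsonPayload, and labels is an assumption made for illustration: the function names, the "message" key used to carry a string Body, the precedence on key collisions, and the use of JSON serialization for non-string label values.

```python
import json
from typing import Any, Dict, Union

Body = Union[str, Dict[str, Any]]


def to_label(value: Any) -> str:
    # Label values must be strings. Serializing non-strings as JSON is an
    # assumption for this sketch, not necessarily what the exporter does.
    return value if isinstance(value, str) else json.dumps(value, sort_keys=True)


def current_mapping(body: Body, attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Current behavior as described in the issue:
    Body -> textPayload or jsonPayload, Attributes -> labels."""
    entry: Dict[str, Any] = {}
    if isinstance(body, dict):
        entry["jsonPayload"] = body
    else:
        entry["textPayload"] = body
    entry["labels"] = {k: to_label(v) for k, v in attributes.items()}
    return entry


def proposed_mapping(
    body: Body,
    attributes: Dict[str, Any],
    resource_attributes: Dict[str, Any],
) -> Dict[str, Any]:
    """Proposed behavior: Body and Attributes combined into jsonPayload,
    Resource Attributes -> labels."""
    payload = dict(attributes)
    if isinstance(body, dict):
        # Assumption: Body keys win on collision; the issue does not specify this.
        payload.update(body)
    else:
        # Hypothetical key for keeping the raw log line next to the parsed fields.
        payload["message"] = body
    labels = {k: to_label(v) for k, v in resource_attributes.items()}
    return {"jsonPayload": payload, "labels": labels}
```

With a string Body and parsed attributes, current_mapping yields a textPayload plus string-only labels, while proposed_mapping keeps the parsed (and possibly nested) attributes intact inside jsonPayload.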
Parsing a mongodb log
This example shows that even for a simple regex-parsed log entry, the exporter by default maps the parsed attributes to labels. For more information on the entry model, please see here.
Config
Log Entry
Output
Example Output of MongoDB Logs from Ops Agent
Even the official agent of Google parses these logs and puts them into jsonPayload.
Parsing an Elasticsearch audit log
Because attributes are mapped to labels, parsing bodies with nested fields can make results difficult to search in GCL, since all label values must be strings:
Config
Original Log
The nested keys are turned into strings when placed on labels, making them impossible to match on properly:
Output
The Loki exporter, for example, preserves the nested keys, making them matchable:
Loki
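The original config and output blocks for this example are not reproduced here. As a rough illustration of why string-only labels make nested keys unmatchable, here is a small Python sketch. The helper names and the sample attributes are made up for illustration (they are not taken from the issue's Elasticsearch log), and serializing nested values as JSON strings is an assumption about how the flattening might look, not the exporter's documented behavior.

```python
import json
from typing import Any, Dict, Optional


def to_labels(attributes: Dict[str, Any]) -> Dict[str, str]:
    # Assumption: non-string values are serialized to JSON strings,
    # because every label value has to be a string.
    return {
        key: value if isinstance(value, str) else json.dumps(value, sort_keys=True)
        for key, value in attributes.items()
    }


def get_path(structure: Any, path: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts.

    Returns None as soon as a step is missing or is not a dict."""
    current = structure
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Made-up parsed attributes with one nested field.
attributes = {"level": "info", "user": {"name": "alice"}}

labels = to_labels(attributes)

# In a structured payload the nested key can still be addressed.
nested_match = get_path(attributes, "user.name")

# In labels, "user" is now a single string, so "user.name" cannot be reached.
label_match = get_path(labels, "user.name")
```

Here nested_match resolves to the nested value, while label_match is None because the nested map has been collapsed into one opaque string.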