[BUG] Opensearch dynamic index not working with nested fields #2259
Comments
@skylagallm, Data Prepper uses JSON Pointer syntax, so paths are written with slashes instead of dots. In your example, |
Hi dlvenable, thanks for your support. It is probably my fault; I did not explain the scenario well.
As you can see, the field of interest is a top-level field with dots in its name. I have also just run a test with the following config:
However, this is not working. |
Hi @skylagallm, If you are using an Amazon OpenSearch domain, then you also need to provide auth ( The other As far as your use case with the dynamic indices, I have locally tested and confirmed that this configuration works and creates an index named
The index
I have also confirmed that setting
The index created in OpenSearch was named
So whether the key is top-level or nested, your use case should be covered. It might help if you were able to share the json Event that is input to data prepper, or by just debugging to the |
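To make the top-level versus nested distinction concrete, here is a minimal sketch of JSON Pointer (RFC 6901) resolution in Python. It is not Data Prepper's implementation; the `resolve_pointer` helper and both sample events are hypothetical. It only illustrates that a slash separates levels, while a dot is an ordinary character inside a key name.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against nested dicts and lists."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" first, then "~0".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical events: one with a dotted top-level key, one truly nested.
dotted_event = {"resource.attributes.tenant": "test"}
nested_event = {"resource": {"attributes": {"tenant": "test"}}}

# The dotted key is a single token; the nested value needs one token per level.
print(resolve_pointer(dotted_event, "/resource.attributes.tenant"))
print(resolve_pointer(nested_event, "/resource/attributes/tenant"))
```

With this reading, `/resource.attributes.tenant` only matches an event that has that exact dotted key at the top level, which is why knowing the real shape of the event matters.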
Hi @graytaylor0, thanks for your effort in trying to replicate the behaviour locally; however, I am now a bit confused. I will try to summarize my findings. Let's say that this (as returned by the stdout sink, without any processing) is the input of my trace/service map pipeline:
The latest field If I try this setup:
This works perfectly (including indexing into OpenSearch using the IAM roles attached to the Data Prepper EC2 instance). But, as you can see from the input shown above, the field resource.attributes.tenant already exists, and without the add_entries processor the exact same config does not work for me. This makes it seem as though the OpenSearch sink cannot access fields that already exist in the event. That cannot be the whole story, though, because if I create the index from a top-level field without any dots in its name, the config works (same input doc as above):
I have also tried some workarounds, such as copying the value from the resource.attributes.tenant field to another field called "test", but this does not work either. Many thanks for your help, Marco |
@skylagallm , if you use |
I wanted to put a comment on here because I am seeing similar behavior, albeit not in the same exact context. I am trying to set up some routes for metrics. Under certain conditions the data will go to different indices. When using a root attribute such as serviceName, I am able to successfully filter and route records. Example
But when I try to do this with any other attribute not in the root, the routing is not applied. All records, whether or not they meet the criteria, either go to the destination or are blocked. I had actually tried many more configs than just the ones below, with no success. Example
I also attempted this with other processors such as copy_values and was not able to get the behavior to work either. I think some of the confusion here stems from the data source. I see working examples using a log generator. In my case, and apparently in @skylagallm's case too, we are getting data from OTEL. Something about the serialization handling here is preventing the plugins from reading or accessing these attributes. |
+1 to @fkirwin's issue regarding conditional routing. I'm also trying to set up routing logic based on span attributes from an OTEL data source. My routes are defined like so:
route:
  - super-tenant: '/span.attributes.tenant == "super"'
  - other-tenant: '/span.attributes.tenant != "super"'
When I run this in my dev environment, I see data prepper logs like this:
I'm not sure how the |
@kjorg50 Data Prepper flattens the "attributes" and puts them in the parent object. So, if you are trying to match any fields inside the attributes, use it without the attributes. So try something like |
I am also having the same issue as @kjorg50 and others in this ticket. When parsing opentelemetry attributes for spans or resources they get transformed within the OtelProtoCodec into dot notation: Line 99 in e1ea5e1
As the opening of this ticket describes, it's straightforward to reproduce this with an OpenTelemetry Collector that generates attributes:
As far as I can tell, that makes them impossible to address with the conditional routing feature, as the antlr parser fails on the dot notation. I also attempted to get around this by renaming the fields within the processor using the copy_value functionality. Addressing keys also does not work and the from_key will never be read.
While I appreciate @graytaylor0's efforts earlier to show how these operations function, they all work for me when I am using other pipelines; it's the opentelemetry attributes that appear to be the problem. I can open a separate ticket for these if it makes sense, but they all appear somewhat related. How can dataprepper get a pointer to an opentelemetry attribute in an event under any component of a pipeline? |
Following up to help others experiencing this issue: Attributes parsed out of OTEL input are transformed in the following way - They are appended with So, to access a resource attribute such as |
@nickrab You are a life saver! I spent a week trying to figure out how to access/reference these attribute values in data prepper before stumbling across your reply here. Adding a little more help for others who might come across this. Looking at the resulting documents saved in OpenSearch, the path to some of the fields you might want to access in a processor is not obvious. @nickrab's reply inspired me to start digging through the data prepper code, and I found the data prepper span interface, which shows you what fields are available on the underlying event object. Digging a little further, we can look at OTelProtoCodec.parseSpan(), which takes an OTEL span and converts it to the data prepper span. The key is where it builds the attributes, and you can see it is merging the span, resource, instrumentation scope, and status "attributes" into the single top-level object. I went nuts trying to figure out how to remove the
otel-traces-pipeline:
  processor:
    - otel_traces:
    - delete_entries:
        with_keys:
          - "attributes/instrumentationScope.version"
          - "attributes/instrumentationScope.name"
          - "attributes/status.code"
          - "attributes/status.message" |
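For readers unsure what the keys above address, here is a small Python sketch of the idea: each `with_keys` entry is treated as a slash-separated path whose last segment is removed from its parent object. The `delete_entries` helper and the sample event (including the `span.attributes.tenant` key and all values) are hypothetical and only imitate the effect described in this comment; this is not Data Prepper code.

```python
def delete_entries(event, with_keys):
    """Remove each slash-separated path from a nested dict, skipping missing ones."""
    for path in with_keys:
        *parents, leaf = path.strip("/").split("/")
        node = event
        for token in parents:
            node = node.get(token)
            if not isinstance(node, dict):
                break  # parent path does not exist; nothing to delete
        else:
            node.pop(leaf, None)
    return event


# Hypothetical event: merged attributes live in one "attributes" object,
# and the dotted names are single keys inside it.
event = {
    "traceId": "abc",
    "attributes": {
        "instrumentationScope.name": "my-lib",
        "instrumentationScope.version": "1.0",
        "status.code": 0,
        "status.message": "",
        "span.attributes.tenant": "super",
    },
}

delete_entries(event, [
    "attributes/instrumentationScope.version",
    "attributes/instrumentationScope.name",
    "attributes/status.code",
    "attributes/status.message",
])
print(sorted(event["attributes"]))
```

Only the slash moves down a level here; `instrumentationScope.version` is looked up as one key inside `attributes`, which matches the form of the paths in the config above.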
Hi @dlvenable, any update on this? @nickrab @patrick-fa I'm using Amazon OpenSearch; can you confirm how I should compose the JSON path of the variable? I have already tried these three options:
indexName: otel-logs-${/resource/attributes/service.namespace}-${/resource/attributes/service.name}-%{yyyy.MM}
indexName: otel-logs-${/resource/attributes/service@namespace}-${/resource/attributes/service@name}-%{yyyy.MM}
The otel log file looks something like this.
{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.namespace",
            "value": {
              "stringValue": "qa"
            }
          },
          {
            "key": "service.name",
            "value": {
              "stringValue": "app"
            }
          }
        ]
      },
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "observedTimeUnixNano": "1722943858480949454",
              "body": {
                "stringValue": "log"
              },
              "attributes": [
                {
                  "key": "log.file.path",
                  "value": {
                    "stringValue": "/var/log/messages"
                  }
                },
                {
                  "key": "log.file.name",
                  "value": {
                    "stringValue": "messages"
                  }
                },
                {
                  "key": "application",
                  "value": {
                    "stringValue": "app"
                  }
                },
                {
                  "key": "env",
                  "value": {
                    "stringValue": "qa"
                  }
                },
                {
                  "key": "region",
                  "value": {
                    "stringValue": "eu-west-2"
                  }
                },
                {
                  "key": "ui_version",
                  "value": {
                    "stringValue": "v1"
                  }
                },
                {
                  "key": "version",
                  "value": {
                    "stringValue": "v1.0"
                  }
                }
              ],
              "traceId": "",
              "spanId": ""
            }
          ]
        }
      ]
    }
  ]
} |
I confirm that this works. Thanks @nickrab.
indexName: otel-logs-${/attributes/resource.attributes.service@namespace}-${/attributes/resource.attributes.service@name}-%{yyyy.MM} |
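As a rough sketch of why the confirmed path has this shape: judging by the index name above, an OTLP resource attribute such as `service.namespace` ends up inside the event's `attributes` object, with a `resource.attributes.` prefix and its dots replaced by `@`. The Python below mimics that mapping for the two resource attributes from the sample log in this thread. The function names are hypothetical, the mapping is inferred from this thread rather than copied from Data Prepper's code, and the `%{yyyy.MM}` date suffix is left out for brevity.

```python
import re


def flatten_resource_attributes(otlp_attributes):
    """Map OTLP key/value pairs to keys of the form seen in this thread."""
    flattened = {}
    for item in otlp_attributes:
        # Assumption inferred from the thread: dots in the key become '@'.
        key = "resource.attributes." + item["key"].replace(".", "@")
        flattened[key] = item["value"]["stringValue"]
    return flattened


def render_index(template, event):
    """Substitute ${/a/b} placeholders, walking one level per slash."""
    def lookup(match):
        value = event
        for token in match.group(1).lstrip("/").split("/"):
            value = value[token]
        return str(value)
    return re.sub(r"\$\{([^}]+)\}", lookup, template)


resource_attributes = [
    {"key": "service.namespace", "value": {"stringValue": "qa"}},
    {"key": "service.name", "value": {"stringValue": "app"}},
]
event = {"attributes": flatten_resource_attributes(resource_attributes)}

template = (
    "otel-logs-${/attributes/resource.attributes.service@namespace}"
    "-${/attributes/resource.attributes.service@name}"
)
print(render_index(template, event))  # otel-logs-qa-app
```

Under these assumptions, the earlier attempts starting with `/resource/attributes/...` fail because the event has no nested `resource` object to descend into.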
@patrick-fa thank you for the in-depth explanation based on the source code; it helped me deal with the OTLP logs as well.
|
Describe the bug
Using the latest version (2.1.0) with an OpenSearch dynamic index, Data Prepper does not work when the index name references a nested field inside the event.
To Reproduce
Steps to reproduce the behavior:
otel will insert this custom key as:
resource.attributes.tenant
Expected behavior
I would expect Data Prepper to create an index named otel-v1-apm-span-test-2023.02.09, but it doesn't.
By the way, if I use a top-level field like "serviceName", everything works fine.