[Feature Request] allow field names with dots in the name #11749

Open
rursprung opened this issue Jan 4, 2024 · 2 comments
Labels
enhancement, Indexing & Search, Indexing, Search:Query Capabilities

Comments

@rursprung
Contributor

Is your feature request related to a problem? Please describe

Sometimes we need to index documents whose field names contain dots but should not be turned into subobjects (e.g. foo.bar should stay a single field named foo.bar and not become a field foo containing a nested field bar).
This is especially relevant when the document also has a second field whose name is part of the other field's path (here, foo): foo would then have to be both a leaf value and an object, which does not work.
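
To make the collision concrete, here is a minimal sketch in Python of how dotted names are expanded into subobjects. It is a simplified model, not OpenSearch's actual code: `expand_dots` is a hypothetical helper, and it assumes every value in the incoming document is a scalar.

```python
def expand_dots(doc):
    """Expand dotted keys into nested dicts (simplified model, scalar values only)."""
    root = {}
    for name, value in doc.items():
        node = root
        *parents, leaf = name.split(".")
        for part in parents:
            # Walk (or create) one object level per path segment.
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"'{part}' is already a leaf field, cannot hold '{name}'")
            node = child
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"'{name}' is already an object, cannot be a leaf field")
        node[leaf] = value
    return root


# A dotted name alone is fine: it becomes an object with one nested field.
print(expand_dots({"foo.bar": 1}))

# Adding a plain "foo" field makes "foo" both a leaf and an object.
try:
    expand_dots({"foo": "a", "foo.bar": 1})
except ValueError as err:
    print("conflict:", err)
```

In this model the conflict appears regardless of which of the two fields is seen first.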

Describe the solution you'd like

It must be possible to declare that certain fields should not be expanded into subobjects.

Related component

Indexing

Describe alternatives you've considered

One workaround is to append a custom suffix to the leaf field names so that they (hopefully) never form part of another field's path, e.g. instead of foo and foo.bar one could use foo_x and foo.bar_x. The object is then foo and the leaf is foo_x, which avoids the name collision with foo.
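
A rough sketch of that workaround, applied client-side before indexing. The helper names (`add_leaf_suffix`, `has_path_collision`) and the `_x` suffix are illustrative only, and the input is assumed to be a flat dict keyed by dotted field names.

```python
def add_leaf_suffix(doc, suffix="_x"):
    """Append a suffix to every (flat, dotted) field name."""
    return {name + suffix: value for name, value in doc.items()}


def has_path_collision(names):
    """True if any field name is also a proper path prefix of another field."""
    names = set(names)
    for name in names:
        parts = name.split(".")
        for i in range(1, len(parts)):
            if ".".join(parts[:i]) in names:
                return True
    return False


doc = {"foo": "a", "foo.bar": 1}
renamed = add_leaf_suffix(doc)

print(has_path_collision(doc))      # the original names collide on "foo"
print(renamed)                      # {'foo_x': 'a', 'foo.bar_x': 1}
print(has_path_collision(renamed))  # no collision after renaming
```

The suffix only helps as long as no real field path already contains a segment ending in the suffix, which is why the workaround is fragile.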

Additional context

Elasticsearch introduced a "subobjects": false mapping option in ES 8.x.
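
For reference, a sketch of what such a mapping looks like on the Elasticsearch side, written as a Python dict that serializes to the index-creation body. This shows the Elasticsearch option only; it is not something OpenSearch accepts as of this issue, and the field names and keyword types are just the example from above.

```python
import json

# Index-creation body for Elasticsearch 8.x with subobjects disabled at the root.
# With this setting, "foo" and "foo.bar" are two independent leaf fields.
mapping = {
    "mappings": {
        "subobjects": False,
        "properties": {
            "foo": {"type": "keyword"},
            "foo.bar": {"type": "keyword"},
        },
    }
}

print(json.dumps(mapping, indent=2))
```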

@rursprung rursprung added enhancement Enhancement or improvement to existing feature or request untriaged labels Jan 4, 2024
@github-actions github-actions bot added the Indexing Indexing, Bulk Indexing and anything related to indexing label Jan 4, 2024
@msfroh
Collaborator

msfroh commented Jan 4, 2024

This should be doable.

I believe we need to add a new property to ObjectMapper, which will be inherited by RootObjectMapper. Based on that property, we can decide how we resolve sub-mappers (whether we try to resolve sub-objects for dots or not).
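
As a rough illustration of the idea (not the actual OpenSearch classes), here is a toy mapper in Python with such a property. `ToyObjectMapper`, its `subobjects` flag, and `resolve` are all hypothetical names; the real ObjectMapper is Java and considerably more involved.

```python
class ToyObjectMapper:
    """Hypothetical stand-in for an object mapper with a `subobjects` switch."""

    def __init__(self, subobjects=True):
        self.subobjects = subobjects
        self.children = {}  # name -> "leaf" marker or nested ToyObjectMapper

    def _add_leaf(self, name):
        existing = self.children.setdefault(name, "leaf")
        if existing != "leaf":
            raise TypeError(f"'{name}' is already mapped as an object")
        return [name]

    def resolve(self, field_name):
        """Register field_name and return the mapper keys used to reach it."""
        if not self.subobjects:
            # Dots are ordinary characters: one leaf keyed by the full name.
            return self._add_leaf(field_name)
        head, _, rest = field_name.partition(".")
        if not rest:
            return self._add_leaf(head)
        child = self.children.setdefault(head, ToyObjectMapper())
        if not isinstance(child, ToyObjectMapper):
            raise TypeError(f"'{head}' is already mapped as a leaf field")
        return [head] + child.resolve(rest)


default = ToyObjectMapper()
print(default.resolve("foo.bar"))   # ['foo', 'bar']

flat = ToyObjectMapper(subobjects=False)
print(flat.resolve("foo.bar"))      # ['foo.bar']
print(flat.resolve("foo"))          # ['foo']
```

With the flag off, foo and foo.bar coexist as siblings; with it on, registering foo after foo.bar raises, which mirrors the conflict described in the issue.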

(I could be wrong about the above, but that's where I would start looking to work on this.)

@stowns

stowns commented May 13, 2024

We have the same request, though for a different reason. We are trying to index JSON objects that may contain large nested values. flat_object did not work for us because some of the nested values exceed 32 KB, which throws an error in Lucene: flat_object's nested fields are indexed as keywords, and those can't exceed 32 KB. As a workaround, I tried to flatten all objects before indexing and have each nested field treated as text via dynamic_templates, i.e.:

"dynamic_templates": [
  {
    "context": {
      "path_match": "context.*",
      "mapping": {
        "type": "text"
      }
    }
  },
  {
    "message": {
      "path_match": "message.*",
      "mapping": {
        "type": "text"
      }
    }
  }
]

However, now I'm experiencing

class org.opensearch.index.mapper.TextFieldMapper cannot be cast to class org.opensearch.index.mapper.ObjectMapper

because the dots in the keys are being expanded into an object :(
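
For readers who want to reproduce this, the client-side flattening step might look roughly like the sketch below. `flatten` is a hypothetical helper, not part of any OpenSearch client; it leaves lists untouched and silently drops empty objects.

```python
def flatten(obj, prefix=""):
    """Collapse nested dicts into a single level keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {"context": {"user": {"id": 7}}, "message": {"body": "hi"}}
print(flatten(doc))  # {'context.user.id': 7, 'message.body': 'hi'}
```

The resulting keys (context.user.id, message.body) match the path_match patterns above, but as described, the dots in those keys are still expanded back into objects at index time.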

Projects
Status: Later (6 months plus)
Development

No branches or pull requests

3 participants