Ability to differentiate between nested fields and those whose names contain `.` #1793

abhinavdangeti · 2023-02-24T21:49:16Z

Reference: https://issues.couchbase.com/browse/MB-55699
bleve uses "." as the path separator for nested field names.
This can conflict with fields whose names themselves contain ".",
which is permitted.
So if one were to index a document such as this ..
```
  {
      "x": {
          "y": "1"
      },
      "x.y": "2"
  }
```
Both values are indexed under the same field "x.y", which will
contain the tokens "1" and "2". The real problem surfaces when
different analyzers are used for these 2 fields: at search time,
an analytic query will not be able to accurately pick an analyzer
to apply over the search criteria. It is also problematic when
the data types of the 2 fields differ.
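
To make the collision concrete, here is a minimal sketch that flattens the document above by joining keys with ".". The `flatten` helper is hypothetical and only illustrates the ambiguity; it is not bleve's document walker.

```go
package main

import (
	"fmt"
	"sort"
)

// flatten is a hypothetical helper: it walks a decoded JSON object and
// records every leaf value under a path built by joining keys with ".".
func flatten(prefix string, doc map[string]interface{}, out map[string][]string) {
	for key, val := range doc {
		path := key
		if prefix != "" {
			path = prefix + "." + key
		}
		switch v := val.(type) {
		case map[string]interface{}:
			// Nested object: recurse with the extended path.
			flatten(path, v, out)
		default:
			// Leaf value: record it under the joined path.
			out[path] = append(out[path], fmt.Sprint(v))
		}
	}
}

func main() {
	doc := map[string]interface{}{
		"x":   map[string]interface{}{"y": "1"},
		"x.y": "2",
	}
	out := map[string][]string{}
	flatten("", doc, out)
	for path, vals := range out {
		sort.Strings(vals) // map iteration order is random; sort for stable output
		fmt.Println(path, vals) // prints: x.y [1 2]
	}
}
```

The nested `x` → `y` and the top-level `x.y` end up under one key, so nothing downstream can tell them apart.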
The proposal here is to wrap each field name in backticks under
the hood, so that its true meaning is preserved.
So for example ..
- `a.b` is a single unnested field name
- `a`.`b` is a nested field name with `b` being a child field of `a`
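
A minimal sketch of the decoration step, assuming the path is already available as a list of segments and that field names do not themselves contain backticks. `decoratePath` is a hypothetical name, not a function from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// decoratePath is a hypothetical helper: it wraps every path segment in
// backticks and joins the segments with ".", so a "." inside a segment
// can no longer be mistaken for a path separator.
func decoratePath(segments []string) string {
	decorated := make([]string, len(segments))
	for i, s := range segments {
		decorated[i] = "`" + s + "`"
	}
	return strings.Join(decorated, ".")
}

func main() {
	fmt.Println(decoratePath([]string{"a.b"}))    // prints: `a.b`
	fmt.Println(decoratePath([]string{"a", "b"})) // prints: `a`.`b`
}
```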
Here are the ramifications of this approach:
- While indexing, users can still specify field names as
  they appear in their JSON documents. Under the hood,
  however, these field names will now be registered in
  their decorated form to avoid ambiguity.
- While querying, users can still specify fields as they
  expect to see them within their JSON documents. Note that
  it will be the user's responsibility to differentiate
  between nested field names and names that contain ".".
  Consider an index mapping that indexes the document
  used earlier. The searches that would work here are ..
  1. {"field": "`x.y`", "match": 2}
  2. {"field": "x.y", "match": 1}
  3. {"field": "`x`.`y`", "match": 1}
- Users will also be responsible for specifying sort keys,
  facet fields, highlight fields accordingly in their search
  requests. For example ..
```
  x        : interpreted as `x`
  `x`      : interpreted as `x`
  x.y      : interpreted as `x`.`y`
  `x.y`    : interpreted as `x.y`
  `x`.`y`  : interpreted as `x`.`y`
```
- In the search response, users will now see decorated
  names for fragments, locations and facets to avoid any
  ambiguous interpretation of the field names.
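
The interpretation rules listed above for sort keys, facet fields and highlight fields can be sketched as a small parser. This is an illustration under stated assumptions, not the code in this PR: `splitFieldPath` and `canonicalField` are hypothetical names, and the sketch assumes backticks are always balanced and never appear inside a field name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFieldPath (hypothetical) splits a user-supplied field reference into
// path segments. A "." inside backticks is part of the name; a "." outside
// backticks separates segments.
func splitFieldPath(field string) []string {
	var segments []string
	var cur strings.Builder
	inQuote := false
	for _, r := range field {
		switch {
		case r == '`':
			inQuote = !inQuote
		case r == '.' && !inQuote:
			segments = append(segments, cur.String())
			cur.Reset()
		default:
			cur.WriteRune(r)
		}
	}
	return append(segments, cur.String())
}

// canonicalField (hypothetical) returns the fully decorated form of a field
// reference, with every segment wrapped in backticks.
func canonicalField(field string) string {
	segments := splitFieldPath(field)
	for i, s := range segments {
		segments[i] = "`" + s + "`"
	}
	return strings.Join(segments, ".")
}

func main() {
	for _, f := range []string{"x", "`x`", "x.y", "`x.y`", "`x`.`y`"} {
		fmt.Printf("%-8s : interpreted as %s\n", f, canonicalField(f))
	}
}
```

Under these assumptions, an undecorated `x.y` and a decorated `` `x`.`y` `` resolve to the same nested field, while `` `x.y` `` stays a single field name.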


iredmail · 2023-02-25T03:40:03Z

Is this change backward-compatible?

abhinavdangeti · 2023-02-25T14:57:14Z

Not yet, need to think about it.

abhinavdangeti added this to the v2.3.7 milestone Feb 24, 2023

abhinavdangeti force-pushed the mb55699 branch from 2b2d531 to 99aa596 Compare February 24, 2023 21:59

abhinavdangeti added 2 commits February 24, 2023 15:17

MB-55699: Unit test demo-ing new behavior

c4245c8

abhinavdangeti force-pushed the mb55699 branch from 99aa596 to c4245c8 Compare February 24, 2023 22:24

abhinavdangeti requested review from Thejas-bhat, metonymic-smokey and moshaad7 February 24, 2023 22:26

abhinavdangeti modified the milestone: v2.4.0 Feb 27, 2023

abhinavdangeti changed the title ~~Ability to differentiate between nested fields and those with .~~ Ability to differentiate between nested fields and those whose names contain . May 4, 2023

abhinavdangeti added the do not merge label Sep 7, 2023

abhinavdangeti removed this from the v2.4.0 milestone Oct 31, 2023

abhinavdangeti marked this pull request as draft October 31, 2023 18:13

abhinavdangeti removed request for Thejas-bhat, metonymic-smokey and moshaad7 May 15, 2024 18:49

abhinavdangeti added the breaking-change label Aug 19, 2024
