Ability to differentiate between nested fields and those whose names contain .
#1793
Reference: https://issues.couchbase.com/browse/MB-55699
bleve uses "." as the path separator for nested field names.
This can conflict with fields whose names themselves contain
".", which is allowed.
So if one were to index a document such as this ..
The field "x.y" will contain tokens "1" and "2". The real
problem arises when different analyzers are used for these
two fields: at search time, an analytic query cannot
reliably pick which analyzer to apply to the search
criteria. It is equally problematic when the data types of
the two fields differ.
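To make the collision concrete, here is a minimal Go sketch. It is not bleve's indexing code: `flatten` is a hypothetical helper that simply joins nested keys with ".", and the document shape (a nested `x` -> `y` alongside a top-level key named `x.y`) is an assumption inferred from the tokens mentioned above.

```go
package main

import "fmt"

// flatten walks a decoded JSON object and joins nested keys with ".",
// imitating (not reproducing) a "."-separated field naming scheme.
// Hypothetical helper for illustration only.
func flatten(prefix string, doc map[string]interface{}, out map[string][]interface{}) {
	for key, val := range doc {
		path := key
		if prefix != "" {
			path = prefix + "." + key
		}
		if child, ok := val.(map[string]interface{}); ok {
			flatten(path, child, out)
			continue
		}
		out[path] = append(out[path], val)
	}
}

func main() {
	// Assumed document: nested field x -> y plus a top-level field "x.y".
	doc := map[string]interface{}{
		"x":   map[string]interface{}{"y": 1},
		"x.y": 2,
	}
	fields := map[string][]interface{}{}
	flatten("", doc, fields)
	for path, vals := range fields {
		fmt.Printf("%q holds %d value(s)\n", path, len(vals))
	}
}
```

Both values end up under the single path `x.y`, so nothing downstream can tell which source field a token came from.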
The proposal here is to decorate field names under the hood
within backticks to preserve their true meaning.
So for example ..
- `a.b` is a single unnested field name
- `a`.`b` is a nested field name with `b` being a child field of `a`

Here are the ramifications of this approach:
While indexing, users can still specify field names as
they appear in their JSON documents. Under the hood,
however, these field names will now be registered with
their decorated versions to avoid ambiguity.
While querying, users can still specify fields as they
expect to see them within their JSON documents. Note that
it will be the user's responsibility to differentiate
between nested field names and others.
Let's consider an index mapping that indexes the document
used earlier^. The searches that'd work here are ..
1. {"field": "`x.y`", "match": 2}
2. {"field": "x.y", "match": 1}
3. {"field": "`x`.`y`", "match": 1}

Users will also be responsible for specifying sort keys,
facet fields, and highlight fields accordingly in their
search requests. For example ..
In the search response, users will now see decorated
names for fragments, locations and facets to avoid any
ambiguous interpretation of the field names.
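
The decoration scheme described above can be sketched in Go as follows. This is an illustration under stated assumptions, not the code in this change: `decoratePath` and `splitPath` are hypothetical names, field names are assumed not to contain backticks themselves, and unbalanced backticks are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// decoratePath wraps each path segment in backticks and joins them with ".".
// Hypothetical helper; segments are assumed not to contain backticks.
func decoratePath(segments []string) string {
	quoted := make([]string, len(segments))
	for i, s := range segments {
		quoted[i] = "`" + s + "`"
	}
	return strings.Join(quoted, ".")
}

// splitPath breaks a user-supplied field name into segments. A "." inside
// backticks is part of the name; a "." outside backticks is a separator.
func splitPath(field string) []string {
	var segments []string
	var cur strings.Builder
	inTicks := false
	for _, r := range field {
		switch {
		case r == '`':
			inTicks = !inTicks // toggle quoted state, drop the backtick itself
		case r == '.' && !inTicks:
			segments = append(segments, cur.String())
			cur.Reset()
		default:
			cur.WriteRune(r)
		}
	}
	return append(segments, cur.String())
}

func main() {
	for _, field := range []string{"`x.y`", "x.y", "`x`.`y`"} {
		segs := splitPath(field)
		fmt.Printf("%s -> %d segment(s), decorated as %s\n",
			field, len(segs), decoratePath(segs))
	}
}
```

Under this sketch, "x.y" and "`x`.`y`" both resolve to the two segments `x` and `y` (the nested field), while "`x.y`" stays a single segment, which mirrors the three searches listed earlier.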