Open
Description
This meta issue tracks some of the improvements and cleanups that we identified while working on the runtime fields project (#59332):
- Remove the last usage of
SearchContext#mapperService
: it is used for nested objects - Remove
QueryShardContext#getObjectMapper
: it is mostly used for nested objects and it indirectly exposes retrievingMappedFieldType
going throughFieldMapper
s, which are not aware of runtime fields - Clean up
SearchLookup
handling inSearchExecutionContext
: a separate lookup object needs to be created for the fetch phase which is passed through toMappedFieldFype#valueFetcher
, yet the (wrong) lookup can also be retrieved directly from theSearchExecutionContext
- Revise
NestedScope
onSearchExecutionContext
- Clean up special handling for
allowUnmappedFields
andmapUnmappedAsText
inSearchExecutionContext
, they can be changed from consumers which seems dangerous and are only needed for the percolator. - Review null checks on what
SearchExecutionContext#getFieldType
returns. They most likely need to be converted to useQueryShardCOntext#isFieldMapped
instead which does not take theallowUnmappedField
andmapUnmappedAsString
flags into account - Review calls to
FieldsVisitor#postProcess
and make sure that they are all necessary, in some cases we may be able to skip the call as it does nothing depending on which fields we are processing - Currently, when an index is created without providing mappings it will have a null document mapper until mappings are provided for it, or a document has indexed in it.
MapperService
exposes theDocumentMapper
through which the mappings can be accessed. The fact thatDocumentMapper
can be null leads to null checks in many places and potential NPEs in callers. We'd like to eagerly create a document mapper for a new index, as soon as it gets created. This has the implication that we would lose the semantics around the fact that when a document mapper is null for an index, it means the index is empty.
That would also allow us to apply index sorting when the index is created instead of delaying it to the document mapper creation. - We have a special case to handle the case of an empty doc being indexed, which triggers a dynamic mapping update although the mappings are empty, to ensure that whenever a doc is registered in an index, it has mappings and its corresponding document mapper will not be null then. This special case can be removed if we make sure that every index always has mappings
- Should we move
MapperRegistry
toindex.mapper
package? (Move MapperRegistry to index.mapper package #69805) -
Mapping
is visible from outside of its package but all of its methods are called from within the same package. Review usages and possibly adjust methods visibility accordingly. Also, its instance fields are all package private and accessed directly without going through getters, yet the root object mapper has also a public getter that is only used within the same package. (Remove redundant methods from DocumentMapper #69803) - Make
ContentPath
less confusing - Rework
ObjectMapper
andRootObjectMapper
to not rely onclone
- Make MappedFieldType responsible for building their own field caps objects, and add in parent/nested information
- Reduce the number of
merge
methods onMapperService
, and clarify when they should be called - Merging does not require the entire
MapperService
: ideally all of the merge methods exposed byMapperService
can be factored out to a lightweightMappingMerger
component (merge itself was already removed fromDocumentMapper
and replaced by callingMapping#merge
directly). This means that it will no longer be required to build a fullMapperService
in the master node to execute merges when the indices are not allocated to it. - Rework
NestedDocuments
to depend onMappingLookup
Use a mapping snapshot for fetching nested docs #66877 (@nik9000) - can we remove get field mappings API? or shall we refactor it to make some sense? see also Support include_defaults in the GET-mapping API #5368
- Can we remove getMapper and fieldMappers from MappingLookup? Why would one have to go through mappers by name? Can we consolidate how to access fields in the mappings? What needs access to mappers and objectmappers? Can we have a unified lookup structure?
- Do we even need a DocumentMapper given that MappingLookup became almost a copy of it? We should probably use MappingLookup whenever possible and re-evaluate what we need DocumentMapper for.\
- Rewrite tests (e.g. MapperServiceTestCase) to rely less, or not at all, on DocumentMapper. Many tests build a MapperService only to use a small portion of it, or merge mappings, or retrieve document mapper from it.
Caching cleanups
- Remove
MapperService
fromSearchExecutionContext
entirely. We'd prefer there not be an easy way for the context to get a new snapshot of the mapping. OTOH we don't really want to merge all of the immutable mapping stuff intoMappingLookup
. We have to decide how to remove it. - Remove
SearchContext
's reference toMapperService
. This includes exposingNestedDocuments
inSearchExecutionContext
so thatDefaultSearchContext
takes it from there. And cleaning up access toindexService.mapperService()
inDefaultSearchContext
- Replace or remove the parsing function passed to
MappingLookup
, which hides a reference toDocumentMapper
. - Pipe all volatile read methods on
MapperService
through#mappingLookup()
to make sure we perform a single volatile read. Or we could remove them entirely in favor the caller going throughmappingLookup()
. If the caller keeps themappingLookup
for the duration of the operation then the caller will get a consistent snapshot of the mapping during the entire process. This can potentially become a big change but it can be done in steps, for instance we can start from duplicated methods that are exposed inMapperService
andDocumentMapper
(e.g.simpleMatchToFullName
), that could be rather accessed through retrievingMappingLookup
. -
MapperService#getObjectMapper
is only used in tests, it can probably be replaced byMappingLookup#getObjectMapper
- Make all
FieldMappers
"obviously" immutable. - Fix caching issues around reloading analyzers (Reloading analyzers doesn't bust the request cache #66722)