Ideas around mappings improvements

This meta issue tracks some of the improvements and cleanups that we identified while working on the runtime fields project (#59332):

- [x] Remove the last usage of `SearchContext#mapperService`: it is used for nested objects
- [x] Remove `QueryShardContext#getObjectMapper` : it is mostly used for nested objects and it indirectly exposes retrieving `MappedFieldType` going through `FieldMapper`s, which are not aware of runtime fields
- [x] Clean up `SearchLookup` handling in `SearchExecutionContext`: a separate lookup object needs to be created for the fetch phase which is passed through to `MappedFieldFype#valueFetcher`, yet the (wrong) lookup can also be retrieved directly from the `SearchExecutionContext` 
- [ ] Revise `NestedScope` on `SearchExecutionContext`
- [ ] Clean up special handling for `allowUnmappedFields` and `mapUnmappedAsText` in `SearchExecutionContext`, they can be changed from consumers which seems dangerous and are only needed for the percolator.
- [ ] Review null checks on what `SearchExecutionContext#getFieldType` returns. They most likely need to be converted to use `QueryShardCOntext#isFieldMapped` instead which does not take the `allowUnmappedField` and `mapUnmappedAsString` flags into account
- [ ] Review calls to `FieldsVisitor#postProcess` and make sure that they are all necessary, in some cases we may be able to skip the call as it does nothing depending on which fields we are processing
- [ ] Currently, when an index is created without providing mappings it will have a null document mapper until mappings are provided for it, or a document has indexed in it. 
`MapperService` exposes the `DocumentMapper` through which the mappings can be accessed. The fact that `DocumentMapper` can be null leads to null checks in many places and potential NPEs in callers. We'd like to eagerly create a document mapper for a new index, as soon as it gets created. This has the implication that we would lose the semantics around the fact that when a document mapper is null for an index, it means the index is empty.
That would also allow us to apply index sorting when the index is created instead of delaying it to the document mapper creation.
- [ ] We have a special case to handle the case of an empty doc being indexed, which triggers a dynamic mapping update although the mappings are empty, to ensure that whenever a doc is registered in an index, it has mappings and its corresponding document mapper will not be null then. This special case can be removed if we make sure that every index always has mappings
- [x] Should we move `MapperRegistry` to `index.mapper` package? (#69805)
- [x] `Mapping` is visible from outside of its package but all of its methods are called from within the same package. Review usages and possibly adjust methods visibility accordingly. Also, its instance fields are all package private and accessed directly without going through getters, yet the root object mapper has also a public getter that is only used within the same package. (#69803)
- [ ] Make `ContentPath` less confusing
- [ ] Rework `ObjectMapper` and `RootObjectMapper` to not rely on `clone`
- [ ] Make MappedFieldType responsible for building their own field caps objects, and add in parent/nested information
- [ ] Reduce the number of `merge` methods on `MapperService`, and clarify when they should be called
- [ ] Merging does not require the entire `MapperService`: ideally all of the merge methods exposed by `MapperService` can be factored out to a lightweight `MappingMerger` component (merge itself was already removed from `DocumentMapper` and replaced by calling `Mapping#merge` directly). This means that it will no longer be required to build a full `MapperService` in the master node to execute merges when the indices are not allocated to it. 
- [x] Rework `NestedDocuments` to depend on `MappingLookup` #66877 (@nik9000) 
- [ ] can we remove get field mappings API? or shall we refactor it to make some sense? see also https://github.com/elastic/elasticsearch/issues/5368 
- [ ] Can we remove getMapper and fieldMappers from MappingLookup? Why would one have to go through mappers by name? Can we consolidate how to access fields in the mappings? What needs access to mappers and objectmappers? Can we have a unified lookup structure?
- [ ] Do we even need a DocumentMapper given that MappingLookup became almost a copy of it? We should probably use MappingLookup whenever possible and re-evaluate what we need DocumentMapper for.\
- [ ] Rewrite tests (e.g. MapperServiceTestCase) to rely less, or not at all, on DocumentMapper. Many tests build a MapperService only to use a small portion of it, or merge mappings, or retrieve document mapper from it.


## Caching cleanups
- [ ] Remove `MapperService` from `SearchExecutionContext` entirely. We'd prefer there not be an easy way for the context to get a new snapshot of the mapping. OTOH we don't really want to merge all of the immutable mapping stuff into `MappingLookup`. We have to decide *how* to remove it.
- [x] Remove `SearchContext`'s reference to `MapperService`. This includes exposing `NestedDocuments` in `SearchExecutionContext` so that `DefaultSearchContext` takes it from there. And cleaning up access to `indexService.mapperService()` in `DefaultSearchContext`
- [x] Replace or remove the parsing function passed to `MappingLookup`, which hides a reference to `DocumentMapper`.
- [ ] Pipe all volatile read methods on `MapperService` through `#mappingLookup()` to make sure we perform a single volatile read. Or we could remove them entirely in favor the caller going through `mappingLookup()`. If the caller keeps the `mappingLookup` for the duration of the operation then the caller will get a consistent snapshot of the mapping during the entire process. This can potentially become a big change but it can be done in steps, for instance we can start from duplicated methods that are exposed in `MapperService` and `DocumentMapper` (e.g. `simpleMatchToFullName`), that could be rather accessed through retrieving `MappingLookup`.
- [x] `MapperService#getObjectMapper` is only used in tests, it can probably be replaced by `MappingLookup#getObjectMapper`
- [ ] Make all `FieldMappers` "obviously" immutable.
- [ ] Fix caching issues around reloading analyzers (#66722)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Ideas around mappings improvements #64663

Caching cleanups

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Ideas around mappings improvements #64663

Description

Caching cleanups

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions