Description
Relates to #42949
ValuesSourceConfig
has a concept of "unmapped", which the user has specified a field name, but we are unable to resolve that to a mapped field (Note that this is different from the user not specifying a field at all, in which case a script is required or the query is invalid). This might happen, for example, if a user were running an aggregation over multiple indexes, and one didn't have the field. If the user supplied a missing value, VSConfig returns a ValuesSource
which returns that missing value and the aggregation proceeds as normal. Otherwise, the aggregation is given an opportunity to optimize for the unmapped case via the createUnmapped
method. See TermsAggregatorFactory
for an example of where this is used.
There are a couple of problems to fix here. First and foremost, ValuesSourceConfig
communicates the unmapped state back to the aggregation builder by returning a null
values source. This ends up with a lot of null checking down stream, and the associated cognitive load and potential for missing a spot and triggering an NPE. Instead, VSConfig could expose an isUnmapped
predicate, and always return a defined values source (We have static empty implementations that would be ideal here). The type for this values source would be determined by either the user's valueType hint or the default values source type, if the user didn't send a hint. This is the same way we decide the type for an unmapped field with a missing value now.
Additionally, aggregations that don't specialize for the unmapped case (e.g. Histogram) currently just hard code an aggregator to run. More logically, these should still respect the ValuesSourceType
as resolved in ValuesSourceConfig
. In many cases, createUnmapped
could just call doCreateInternal
with a ValuesSourceType
and an empty (but not null) ValuesSource