Description
We have a number of token filters that we should deprecate in favour of newer functionality: for example, keyword_repeat should be replaced by multiplexer, and shingle and edgengram by the index_phrases and index_prefixes options. Marking these as deprecated is currently made difficult by the way that preconfigured components are built.
Ideally, we should issue deprecation warnings when component factories are created. However, because all preconfigured factories are constructed up-front by the AnalysisRegistry, a deprecation warning on, e.g., keyword_repeat will be emitted for every new index mapping, whether or not that mapping refers to keyword_repeat.
We should change AnalysisRegistry to build component factories only when they are explicitly specified in mappings. As well as making deprecation warnings accurate, this should save some memory by reducing the number of unused factories built per index.
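As a rough illustration of the proposed behaviour, here is a minimal sketch of a registry that stores suppliers and builds a factory only on first lookup, so a deprecation warning is recorded only when the deprecated component is actually requested. The names FilterFactory, LazyFilterRegistry, registerDeprecated and builtCount are hypothetical and invented for this example; this is not the real AnalysisRegistry API, and the real change would likely look different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a token filter factory; not the real Elasticsearch interface.
interface FilterFactory {
    String name();
}

// Hypothetical registry: holds suppliers and constructs a factory only when it is first requested.
class LazyFilterRegistry {
    private final Map<String, Supplier<FilterFactory>> suppliers = new HashMap<>();
    private final Map<String, FilterFactory> built = new HashMap<>();
    private final Map<String, String> replacements = new HashMap<>();

    // Collected in a list here so the sketch is easy to inspect; a real
    // implementation would presumably go through a deprecation logger.
    final List<String> warnings = new ArrayList<>();

    void register(String name, Supplier<FilterFactory> supplier) {
        suppliers.put(name, supplier);
    }

    void registerDeprecated(String name, String replacement, Supplier<FilterFactory> supplier) {
        suppliers.put(name, supplier);
        replacements.put(name, replacement);
    }

    FilterFactory get(String name) {
        FilterFactory existing = built.get(name);
        if (existing != null) {
            return existing; // already built: no second warning, no second construction
        }
        Supplier<FilterFactory> supplier = suppliers.get(name);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown token filter [" + name + "]");
        }
        String replacement = replacements.get(name);
        if (replacement != null) {
            // The warning is tied to construction, which now happens only on explicit use.
            warnings.add("[" + name + "] is deprecated, use [" + replacement + "] instead");
        }
        FilterFactory factory = supplier.get();
        built.put(name, factory);
        return factory;
    }

    int builtCount() {
        return built.size();
    }
}

class LazyFilterRegistryDemo {
    public static void main(String[] args) {
        LazyFilterRegistry registry = new LazyFilterRegistry();
        registry.register("lowercase", () -> () -> "lowercase");
        registry.registerDeprecated("keyword_repeat", "multiplexer", () -> () -> "keyword_repeat");

        // An index that only refers to lowercase builds one factory and triggers no warning.
        registry.get("lowercase");
        System.out.println(registry.builtCount() + " built, " + registry.warnings.size() + " warnings");
    }
}
```

In this sketch, registering a deprecated component costs nothing until an index actually asks for it, which covers both goals above: warnings only on use, and no unused factories held per index.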