Add support for runtime fields #59332

Closed
@javanna

Runtime fields

We would like to increase the flexibility of the search API by introducing support for runtime fields.

Runtime fields are not indexed and do not have doc_values, so Lucene is completely unaware of them. They are nevertheless consumed through the field capabilities API and the search API like any ordinary field: they can be retrieved, queried, aggregated on and sorted on.

Runtime fields make searches slower: their values must be computed for every document that may match the query, and the cost depends on how they are calculated. It is highly recommended that the Async Search API is used to run searches that use runtime fields.
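As a rough sketch of that recommendation, a search that aggregates on a runtime field could be submitted through the Async Search API and polled later. The `day_of_week` field is the one from the mapping example in this description; the timeout and keep-alive values are illustrative assumptions, and `<id>` stands for whatever ID the first response returns.

```console
POST /my-index/_async_search?wait_for_completion_timeout=2s&keep_alive=1d
{
    "size" : 0,
    "aggs" : {
        "days_of_week" : {
            "terms" : {
                "field" : "day_of_week"
            }
        }
    }
}

GET /_async_search/<id>
```

If the search does not finish within the timeout, the first response carries partial results and the ID to poll with the second request.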

One limitation of runtime fields compared to ordinary fields is that they don’t support scoring: they are not indexed, and we are not going to compute the document frequency for them, which scoring requires.

Runtime fields are not part of the _source, hence they are not returned by default as part of the search hits. They can be specifically requested through the field retrieval API (#55363).
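For illustration, assuming the `fields` option introduced by the field retrieval work in #55363 and the `day_of_week` runtime field from the mapping example in this description, an explicit request for the field might look like this:

```console
GET /my-index/_search
{
    "query" : {
        "match_all" : {}
    },
    "fields" : [ "day_of_week" ]
}
```

Each hit would then be expected to carry the computed value in its `fields` section, alongside the unchanged _source.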

A runtime field is defined by its data type and the script that computes its values. Today, each section of the search API already supports scripting, but each uses a different script context and a different syntax. We want to unify this into a single place where a script can be specified. Such a script always has access to _source, to any other stored fields, and to doc_values.

A runtime field can be defined in the mappings by adding its definition to a new runtime section at the same level as properties, which is where fields that exist in _source are defined:

PUT /my-index/_mappings
{
    "runtime" : {
        "day_of_week" : {
            "type" : "keyword",
            "script" : {
                "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
            }
        }
    }
}

The data types initially supported for runtime fields are keyword, long, double, date, ip, boolean and geo_point. In the example above, we extract the day of the week (e.g. Monday) from another field called timestamp, which is defined as a date. The script can refer to other fields, including other runtime fields, so we need to implement a mechanism that resolves field dependencies in the correct order and prevents cyclic dependencies.
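To make the dependency case concrete, here is a minimal sketch of a runtime field that reads another runtime field. The `is_weekend` field is a hypothetical name used only for illustration; it builds on `day_of_week` from the example above, which in turn reads `timestamp`.

```console
PUT /my-index/_mappings
{
    "runtime" : {
        "is_weekend" : {
            "type" : "boolean",
            "script" : {
                "source" : "String day = doc['day_of_week'].value; emit(day == 'Saturday' || day == 'Sunday')"
            }
        }
    }
}
```

Here `is_weekend` can only be computed after `day_of_week`, so the resolution order is `timestamp`, then `day_of_week`, then `is_weekend`. A definition whose script referred back to `is_weekend`, directly or through another runtime field, is the kind of cycle that would have to be rejected.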

The defined field can then be used like any other field in the different sections of the search API:

GET /my-index/_search
{
    "aggs" : {
        "days_of_week" : {
            "terms" : {
                "field" : "day_of_week"
            }
        }
    }
}
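The same applies to queries and sorting. As a sketch, assuming the `day_of_week` runtime field and the `timestamp` date field from the mapping example, a request that filters and sorts on the runtime field could look like this:

```console
GET /my-index/_search
{
    "query" : {
        "terms" : {
            "day_of_week" : [ "Saturday", "Sunday" ]
        }
    },
    "sort" : [
        { "day_of_week" : "asc" },
        { "timestamp" : "desc" }
    ]
}
```

Because the field is not indexed, the query is evaluated by running the script against candidate documents rather than by an index lookup, which is why such requests are more expensive than their indexed equivalents.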

Each runtime field type will consist of a MappedFieldType that exposes a runtime fielddata implementation that generates doc_values on the fly for the needed data type. Additionally, all the basic Lucene queries for each runtime field type need to be written to query the corresponding fielddata/doc_values implementation.

Support for runtime fields in Elasticsearch will be released under the Elastic license.

The following is a high-level list of tasks required to develop the initial support for runtime fields, which will lay the foundations for the next phases:

Mappers and field types

API

Scripting

Infrastructure

Security

Telemetry

Docs

  • Document runtime section and corresponding field types ([DOCS] Add docs for runtime fields #62653)
    • inconsistencies caused by updating a script while queries that rely on it are running
    • existing queries / visualizations may break because runtime fields can be updated
    • queries against runtime fields are deemed expensive and rejected when expensive queries are disallowed
  • Document how to define runtime fields in a search request
  • Document the ability to omit the script from the definition of a runtime field
  • Document the ability to shadow existing fields with a runtime field
  • Document dynamic runtime mode ([DOCS] Add dynamic runtime fields to docs (#66194) #66304)
