Description
Context
A data stream is an index abstraction very similar to an index alias. Under the hood, it points to multiple backing indices and allows searches using a single named resource. The underlying backing indices of a data stream are automatically created and follow this naming convention: .ds-<data-stream-name>-<generation>
For example, a data stream named logs-redis will have backing indices named .ds-logs-redis-000001, .ds-logs-redis-000002, and so on.
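The naming convention can be illustrated with a small helper. This is only a sketch: the function name `backing_index_name` is hypothetical, and the six-digit zero padding is inferred from the examples above.

```python
def backing_index_name(data_stream: str, generation: int) -> str:
    """Build the backing index name for one generation of a data stream."""
    # Assumption: generations are zero-padded to six digits, as in the examples above.
    return f".ds-{data_stream}-{generation:06d}"


print(backing_index_name("logs-redis", 1))  # .ds-logs-redis-000001
print(backing_index_name("logs-redis", 2))  # .ds-logs-redis-000002
```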
The creation of a data stream requires a matching index template containing the mappings and settings used to configure the data stream's backing indices. The data_stream field indicates that the template creates a data stream instead of a regular index.
PUT /_index_template/my-data-stream-template
{
  "index_patterns": [ "logs-haproxy", "logs-nginx", "logs-redis" ],
  "data_stream": { }
}
Note that the index_patterns in the template match the data stream name, not the names of the underlying backing indices.
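To make the distinction concrete, here is a rough sketch of that matching rule. The helper `template_matches` is hypothetical, and Python's `fnmatchcase` only approximates the wildcard matching done by the server.

```python
from fnmatch import fnmatchcase

# Patterns taken from the example template above.
TEMPLATE_PATTERNS = ["logs-haproxy", "logs-nginx", "logs-redis"]


def template_matches(name: str, patterns=TEMPLATE_PATTERNS) -> bool:
    """Return True if the name matches any index pattern of the template."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# The data stream name matches the template...
print(template_matches("logs-redis"))             # True
# ...but the name of one of its backing indices does not.
print(template_matches(".ds-logs-redis-000001"))  # False
```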
Is your feature request related to a problem? Please describe.
We have identified some limitations and possible enhancements when using the Index Management plugin with data streams.
- Unable to associate an ISM policy with a data stream
  To automatically associate an ISM policy with newly created indices, users can define index patterns within the policy itself. While it is possible to define an index pattern that matches the names of the underlying backing indices of a data stream (e.g. .ds-logs-redis-*), it is not yet possible to define an index pattern that matches the name of the data stream directly (e.g. logs-redis). This differs from how index patterns are matched in the case of an index template.
  Suggestion:
  When a newly created index belongs to a data stream, the highest-priority ISM policy with an index pattern matching the parent data stream name should be associated with the index. This will make the index pattern matching behavior consistent with index templates.
- Rollover alias should not be needed for a data stream
  ISM supports the rollover action but is primarily built for index aliases as the rollover target. For this reason, ISM expects the rollover_alias index setting to be defined in the index template. This isn't necessary for data streams because:
  - Data streams do not use an index alias, so the name rollover_alias is confusing.
  - ISM should be able to identify the data stream name (i.e. the rollover target) from the backing index itself.
  - Multiple data streams can be created from a single index template, but the rollover_alias setting in that template can only have one rollover target.
  Suggestion:
  When executing the rollover action on a backing index of a data stream, ISM should use the data stream name as the rollover target. Users will no longer have to define the rollover_alias setting for a data stream.
- Rollup jobs do not resolve the backing indices of a data stream
  A rollup job allows users to define index expressions as the source. These index expressions can resolve index names, index patterns, and index aliases to the list of corresponding concrete indices. The same resolution doesn't happen for data streams.
  Suggestion:
  Index expression resolution should resolve a data stream name to its corresponding backing indices. This will make the resolution behavior consistent with index aliases.
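The three suggestions above could look roughly like the following. This is an illustrative sketch, not the plugin's implementation: `Policy`, `select_policy`, `rollover_target`, and `resolve` are hypothetical names, the `rollover_alias` settings key is simplified, and cluster metadata is modeled with plain dictionaries and lists.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


@dataclass
class Policy:
    policy_id: str
    index_patterns: List[str]
    priority: int


def select_policy(index: str, data_stream: Optional[str],
                  policies: List[Policy]) -> Optional[Policy]:
    """Suggestion 1: match policy patterns against the parent data stream name."""
    # For a backing index, match on the data stream name; otherwise on the index name.
    target = data_stream if data_stream is not None else index
    matching = [p for p in policies
                if any(fnmatchcase(target, pat) for pat in p.index_patterns)]
    # Highest priority wins; None when no policy matches.
    return max(matching, key=lambda p: p.priority, default=None)


def rollover_target(index: str, data_stream: Optional[str],
                    settings: Dict[str, str]) -> Optional[str]:
    """Suggestion 2: a backing index rolls over its data stream, not an alias."""
    if data_stream is not None:
        return data_stream
    # Regular indices still rely on the (simplified) rollover_alias setting.
    return settings.get("rollover_alias")


def resolve(expression: str, indices: List[str], aliases: Dict[str, List[str]],
            data_streams: Dict[str, List[str]]) -> List[str]:
    """Suggestion 3: resolve a data stream name to its backing indices."""
    if expression in data_streams:
        return list(data_streams[expression])
    if expression in aliases:
        return list(aliases[expression])
    # Otherwise treat the expression as an index name or wildcard pattern.
    return [i for i in indices if fnmatchcase(i, expression)]


policies = [Policy("default-logs", ["logs-*"], 1), Policy("redis", ["logs-redis"], 10)]
backing = [".ds-logs-redis-000001", ".ds-logs-redis-000002"]
streams = {"logs-redis": backing}

print(select_policy(backing[0], "logs-redis", policies).policy_id)  # redis
print(rollover_target(backing[0], "logs-redis", {}))                # logs-redis
print(resolve("logs-redis", backing, {}, streams))
```

In each function the data stream lookup comes first, so existing behavior for regular indices and aliases is left unchanged.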
Additional context
This change builds on top of the following feature to support data streams in OpenSearch:
opensearch-project/OpenSearch#675