Skip to content

Moved rollup metadata from cluster state inside the datastream metadata #68465

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

csoulios
Copy link
Contributor

@csoulios csoulios commented Feb 3, 2021

Moved rollup metadata so that it is stored inside the datastream metadata and removed them from the custom metadata in the cluster state.

Also modified the structure of the RollupGroup objects to include metrics

(This implementation currently fails when serializing the datastream
metadata)
(This implementation still fails when serializing the datastream
metadata)
@csoulios csoulios added >non-issue :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data v8.0.0 labels Feb 3, 2021
@csoulios csoulios requested a review from talevy February 3, 2021 15:32
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 3, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

group.add(rollupIndexName, rollupInfo);
rollupGroups.put(rollupGroupKeyName, group);
}
dsMetadata.put(RollupMetadata.TYPE, new RollupMetadata(rollupGroups));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this metadata is intended for metadata defined in composable templates (#63991), I think we should have RollupMetadata be its own field within DataStream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment. We also discussed that this will probably solve the problem we faced when serializing the RollupMetadata object when added in the _meta map.

However, I am still not sure if it's ok to make the datastream aware of rollups.
Any feedback from the @elastic/es-core-features team would be welcome

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On another note, adding the RollupMetadata instance inside the DataStream object makes it almost impossible to guard against the RollupsV2 feature flag

Copy link
Contributor

@talevy talevy Feb 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. the one upside is that it will never be non-null when rollupV2 is not used. So we do not have to worry about existing usage of it since this is internal and we have the assignment of this instance variable under the RollupV2 feature flag. I feel like it would be OK to set and maybe leave a comment about its usage.

Copy link
Contributor Author

@csoulios csoulios Feb 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed implementation of this approach in a separate branch. If we all agree that's a better approach, I will merge it here.

You can view it here:
https://github.com/csoulios/elasticsearch/compare/rollup-metadata-ds..csoulios:rollup-metadata-ds-own-field?diff=split

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this approach that you've laid out in rollup-metadata-ds-own-field. I would just say that NEW_FEATURES_VERSION should not be used to version-guard the new field, but overall, yes I think it is good. Maybe the DataStream team has differing opinion, not sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some context to the DataStream team: We only plan on having Rollup Search Resolution happen when users query datastreams. Wherein underlying rollup indices will take precedence over their respective original indices, where the rollup index can answer the aggregation query. When users query rollup indices outside the scope of datastreams, it behaves just like a regular index so no new metadata associated is needed. This is coupled with DataStream.

@dakrone dakrone self-requested a review February 4, 2021 14:56
@dakrone
Copy link
Member

dakrone commented Feb 4, 2021

Okay, I looked at this PR and I also talked to @danhermann (who is more of a data streams expert than I am). I've come to the conclusion, however, that I don't think we should do this.

I think data stream metadata is the wrong place for this rollup abstraction. Data stream metadata is intended to hold things specific only to a data stream (like the generation, and whether the data stream is hidden). We don't buy ourselves any benefit from tying the resolution metadata into the data stream.

The downside of this PR is that we tie an abstraction into a place where it shouldn't be (in my opinion).

We only plan on having Rollup Search Resolution happen when users query datastreams.

I don't see the technical reason why this limitation is required (which it sounds like it would be if we continue down the path of storing it in the data stream cluster state). There are a lot of valid use cases for rollups that occur outside of data streams (such as with aliases). By tying the resolution only to data streams we prevent users from using rollups outside of the data stream use case (non-timeseries data, timeseries with minor changes that must use aliases, etc).

When users query rollup indices outside the scope of datastreams, it behaves just like a regular index so no new metadata associated is needed. This is coupled with DataStream.

Another downside of the proposed behavior is migration, if a user has a source index and manually rolls it up into a rollup index, then puts those two indices into an alias and migrates the alias into a data stream, the metadata will be missing and the correct resolution will not be performed. I think this will lead to a lot of confusion for users where their rollups don't work unless they have a specific origination (since the metadata is only set in the transport action and then not checked at a later time).

Additionally, restoring backing indices from a snapshot will behave differently, for example, if you restored only a subset of backing indices, then the metadata for the data stream wouldn't be restored and the rollup resolution connection would not be made.

A better place for this resolution metadata would be on the index metadata itself. The rolled up index should have all the metadata available within itself so that it can know what index it can potentially replace (a pointer to an index by uuid for example) as well as its supported aggregations. By keeping the metadata on the index itself, it allows us to deal with these indices at a much more granular level. For example, a user could restore a single non-rolled-up index and its rollup index from a snapshot and still have the rollup resolution work, because the rollup index would contain the necessary information in its metadata to perform the can-match phase accordingly. We would no longer need to deal with the special case where the rolled up index and the rollup metadata are restored (or not restored) separately.

Regardless of whether this gets changed to store the metadata in the index metadata (I know it is currently in the top level cluster metadata), I think the data stream metadata is definitely the wrong place for it.

@dakrone dakrone added the :Data Management/Data streams Data streams and their lifecycles label Feb 4, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Feb 4, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@csoulios
Copy link
Contributor Author

@dakrone thanks a lot for the feedback. I understand your concerns about tying the data stream abstraction with rollups and I agree with most of your points. Also, I would like to explain our way of thinking for this PR and clarify some important points about the new (v2) rollups implementation and how rollups operate on indices and data streams.

By tying the resolution only to data streams we prevent users from using rollups outside of the data stream use case (non-timeseries data, timeseries with minor changes that must use aliases, etc).

Rollups will work for all indices that contain a timestamp. Users will be able to rollup indices that are not members of a data stream in the same way as with indices that belong to a data stream. The only difference will be in the way the results of querying those indices will be returned.

In the legacy (v1) rollups implementation when querying against the _rollup_search endpoint, results from overlapping buckets (returned from rollup and non-rollup indices) were automatically merged so that we did not include duplicates in the results. To do so, rollup indices had a "magic" structure with conventions over field names and users were discouraged from querying them directly through the _search endpoint.

To simplify the user experience we have chosen to expose the rollup indices as "normal" indices that just happen to have
a different structure. This means that when the rollup and non-rollup indices participate in a query, all results will be returned including duplicates. No magic happens here and users don't wonder why some indices were ignored when querying different index patterns. If users want to query rollups they should target only the rollup indices.

We only plan on having Rollup Search Resolution happen when users query datastreams.

On the other hand, a data stream is the perfect abstraction for hiding the underlying indices and hence implement all the magic of merging results. So, when users query data streams, non-rollup indices will be ignored in favour of their rollups. This will make rollups behave like a caching layer in the data stream.

To find the rollup indices of a non-rollup index we need to store this association in a global state. The data stream cluster state looked like a good candidate for this structure, but you are right in that the dataStream.getMetadata() is not a good place to store this association.

Another downside of the proposed behavior is migration, if a user has a source index and manually rolls it up into a rollup index, then puts those two indices into an alias and migrates the alias into a data stream, the metadata will be missing and the correct resolution will not be performed.

This is a very valid point that I had missed.

A better place for this resolution metadata would be on the index metadata itself. .....

This is a good place to keep all metadata (rollup configuration) about the rollup index. However, if we want to find all rollup indices for a specific index, we will have to read the metadata for all indices which is very expensive. This is why we chose to store this association in the global state.

We have come up with a new approach that may eliminate the need for a global state, which I am currently working on. I will get back with more on this.

Thanks again for your feedback. Any further insight on the above points is very welcome.

@csoulios
Copy link
Contributor Author

After discussing with the team, we have decided to move rollup metadata to the rollup index metadata.
Abandoning this approach and will continue the implementation in PR #67783

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Data streams Data streams and their lifecycles >non-issue :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Data Management Meta label for data/management team v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants