-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Fix AggregationFactories.Builder equality and hash regarding order #34005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix AggregationFactories.Builder equality and hash regarding order #34005
Conversation
Pinging @elastic/es-search-aggs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments but +1 for this approach.
@@ -79,12 +79,12 @@ public String getName() { | |||
public abstract AggregationBuilder subAggregation(PipelineAggregationBuilder aggregation); | |||
|
|||
/** Return the configured set of subaggregations **/ | |||
public List<AggregationBuilder> getSubAggregations() { | |||
public Set<AggregationBuilder> getSubAggregations() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not Collection ?
@@ -68,8 +68,8 @@ public String getName() { | |||
* Internal: Validates the state of this factory (makes sure the factory is properly | |||
* configured) | |||
*/ | |||
protected abstract void validate(AggregatorFactory<?> parent, List<AggregationBuilder> factories, | |||
List<PipelineAggregationBuilder> pipelineAggregatorFactories); | |||
protected abstract void validate(AggregatorFactory<?> parent, Set<AggregationBuilder> aggregationBuilders2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, Collection
?
@@ -0,0 +1,162 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have BaseAggregationTestCase
for aggregators so we could maybe add the testing for this case directly there. It would test all implementations rather than just a few.
Thanks @jimczi! I will work on the details once we agreed on the approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a small comment but I'm also +1 for this approach as it conveys the fact that the order is not set if you get the builders from the AggregatorFactories.Builder
but are set when we create the AggregatorFactories
object in build()
} | ||
return new AggregatorFactories(parent, aggFactories, orderedpipelineAggregators); | ||
} | ||
|
||
private List<PipelineAggregationBuilder> resolvePipelineAggregatorOrder( | ||
List<PipelineAggregationBuilder> pipelineAggregatorBuilders, List<AggregationBuilder> aggBuilders, | ||
Set<PipelineAggregationBuilder> pipelineAggregatorBuilders2, Set<AggregationBuilder> aggregationBuilders2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems 2
has slipped into the naming here can we revert to the original names?
17ce46d
to
9135fe8
Compare
Addressed the review comments. @jimczi @colings86 The remaining test failure is another ordering problem: Prior to that change ordering was clearly defined by the order of aggregations in the query. I changed the IT to access by key rather than by position. For the doc/yaml build however: I do not know how to do this and wonder if this PR is good or not. There are potentially more tests that purely work on text matching. Order is not defined in JSON, still users might rely on order. This change makes the order somewhat unpredictable for a user. I think it would be nice to bring back the original behavior or always order in a defined way (lexical order). Thoughts? Comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good, thanks for making the changes @hendrikmuhs . I don't think we need AggregatorFactoriesBuilderTests
though. Instead we could change BaseAggregationTestCase
to explicitly test serialization/equals/hashCode with more than one builder ?
I don't think we should do anything here. The order will change if you have a proxy or a client that processes the json differently so we should not force anything. If we have tests that rely on ordering we should change them but I am 99% sure ;) that the doc/yaml tests don't care about the order. |
de6261b
to
d658931
Compare
I take my 1% chance. ;-) There were 2 ordering problems in the doc tests, 1 was fixable the other one not. But, after changing to a |
This does not tackle the problem. The The problem found is only in I noted that the title is misleading, I will fix that. |
I think it's fine to use a
The fact that we use an array to represent them internally should not change the reasoning. The order of sub-aggregations in the request should not matter.
This pr fixes a real issue, today different orderings of the same sub-aggregations are cached individually. So this query for instance:
... which yields the same response than the first example is considered as a different query by our cache. IMO this is a bug in the |
I now better understand what you mean and I can try to improve test coverage of The non-inherited |
Mainly for testing purposes.
9cebc2e
to
6ab6555
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @hendrikmuhs !
* master: Use more precise does S3 bucket exist method (elastic#34123) LLREST: Introduce a strict mode (elastic#33708) [CCR] Adjust list retryable errors (elastic#33985) Fix AggregationFactories.Builder equality and hash regarding order (elastic#34005) MINOR: Remove some deadcode in NodeEnv and Related (elastic#34133) Rest-Api-Spec: Correct spelling in filter_path description (elastic#33154) Core: Don't rely on java time for epoch seconds formatting (elastic#34086) Retry errors when fetching follower global checkpoint. (elastic#34019) Watcher: Reenable watcher stats REST tests (elastic#34107) Remove special-casing of Synonym filters in AnalysisRegistry (elastic#34034) Rename CCR APIs (elastic#34027) Fixed CCR stats api serialization issues and (elastic#33983) Support 'string'-style queries on metadata fields when reasonable. (elastic#34089) Logging: Drop Settings from security logger get calls (elastic#33940) SQL: Internal refactoring of operators as functions (elastic#34097)
…34005) Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization and deserialization. This ensures storing configs with aggregation works properly. This also addresses a potential issue in caching when the same query contains aggregations but in different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in the cache.
re-enables tests after upstream fixes, see #34005
…34005) Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization and deserialization. This ensures storing configs with aggregation works properly. This also addresses a potential issue in caching when the same query contains aggregations but in different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in the cache.
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization and deserialization.
Note:
At time of writing the impact of this issue mainly affects tests (-> non-issue), however in future work I am storing aggregation configuration as part of job configurations and therefore require this fix.
It also fixes a potential issue in caching when the same query contains aggregations in different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in the cache.
Previous fix: #33942
switches to a LinkedHashSet instead of a List.
CC @jimczi @cbuescher @colings86