Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document aggregation code generation #121644

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

idegtiarenko
Copy link
Contributor

@idegtiarenko idegtiarenko commented Feb 4, 2025

Update code generation documentation with Aggs method signatures expected to be seen/used by the generated code

@idegtiarenko idegtiarenko added >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.1.0 labels Feb 4, 2025
@idegtiarenko idegtiarenko requested a review from nik9000 February 4, 2025 10:26
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Copy link
Contributor

@ivancea ivancea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some remarks that could be worth adding. Maybe.
It looks good for now, thanks for improving this 🙏

* As a result we could not rely on interfaces implementation and generics.
* </p>
* <p>
* In order to implement aggregation logic create your class (typically named "${FunctionName}${Type}Aggregator").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aggregator name format must be followed for it to work.
The AggregateMapper discovers it based on the name, with reflection, given the Function class name + the type:

private static String determineAggName(Class<?> clazz, String type, String extra, boolean grouping) {
return "org.elasticsearch.compute.aggregation."
+ (clazz.getSimpleName().startsWith("Spatial") ? "spatial." : "")
+ clazz.getSimpleName()
+ type
+ extra
+ (grouping ? "Grouping" : "")
+ "AggregatorFunction";
}

So maybe we should rephrase this a bit? If the name has something unexpected, the AggregateMapper must be changed accordingly. Or maybe this isn't that important here anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet more stuff not to like about AggregateMapper....

Copy link
Contributor

@ivancea ivancea Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some thinking about this, and some sync with Ievgen, I'll try to remove AggMapper now. I think we can safely move the reflection to a Supplier method without "much" change. Let's see!

* when both input type I and mutable accumulator state SS and GS are primitive (DOUBLE, INT).
* </li>
* <li>TBD explain {@code IntermediateState}</li>
* <li>TBD explain special internal state `seen`</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the warnings feature, there's also the "failed" state. Identical to seen. Never used in main though, only here: https://github.com/elastic/elasticsearch/pull/116170/files#diff-8a408014887a6dc87eed1f71346536fc77636245d5411714b6ba2cf265812538R18

Maybe worth mentioning, with the warnExceptions attribute

* Grouping aggregation expects:
* <ul>
* <li>type GS (a mutable state used to accumulate result of the grouping aggregation) to be public, not inner and implements {@link org.elasticsearch.compute.aggregation.GroupingAggregatorState}</li>
* <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean:

Suggested change
* <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
* <li>type T (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>

From the following comments.

@alex-spies alex-spies self-requested a review February 5, 2025 14:09
@idegtiarenko idegtiarenko marked this pull request as ready for review February 6, 2025 14:44
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants