Skip to content

Conversation

@lukeelmers
Copy link
Member

@lukeelmers lukeelmers commented Apr 14, 2020

related #61768, #61769

Summary

First pass at creating an agg type function. I went with terms because it is one of the more complex ones. Once we decide we are happy with this approach, we'll create functions for all other aggs, and then refactor esaggs to use them.

The goal here is to make sure we are comfortable with the general approach, before we repeat it for a bunch of aggs 😄

I tried to implement this in a way that makes it as easy as possible to "scale" and add new functions, since we will need to add a bunch of them (one for each agg type). At this point adding a new function looks like this:

  1. Create an interface for the params of the agg type you are writing a function for
  2. Add the interface to AggParamsMapping in types.ts
  3. Write the function, using the interface you wrote in step 1
  4. Write tests for the function
  5. Import your function in agg_types.ts and add it to getAggTypesFunctions
  6. Rinse and repeat for each of the ~30ish aggs 💦

cc @ppisljar @alexwizp

@lukeelmers lukeelmers requested a review from ppisljar April 14, 2020 22:54
@lukeelmers lukeelmers force-pushed the feat/agg-types-function branch from b028380 to f6e1723 Compare April 15, 2020 21:45
@lukeelmers lukeelmers assigned alexwizp and lukeelmers and unassigned alexwizp Apr 15, 2020
@lukeelmers lukeelmers requested a review from alexwizp April 15, 2020 21:46
@lukeelmers lukeelmers added Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) Feature:ExpressionLanguage Interpreter expression language (aka canvas pipeline) Team:AppArch labels Apr 15, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@lukeelmers lukeelmers added release_note:skip Skip the PR/issue when compiling release notes review v7.8.0 v8.0.0 labels Apr 15, 2020
Comment on lines 77 to 80
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the shape of the agg_type which every agg type function returns. That way we can enforce that esaggs only accept agg_type as an argument, instead of a stringified JSON blob.

Comment on lines 83 to 85
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generic helps to streamline the process of typing arguments for an expression function. It uses the provided Name to retrieve the params for the agg from the AggParamsMapping below. Then it combines those with id, enabled, and schema from the agg config options -- these arguments will be required for each of the agg type expression functions, however the params will be different between each.

Comment on lines +94 to +96
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add each new agg type here, along with the params for that agg. This way you can access all the agg params from this mapping.

Eventually we can make this interface public, and folks could even add to it using declare module to append their agg to the interface. (This would be necessary if, for example, there were a special type of aggregation which was only provided by x-pack).

@lukeelmers
Copy link
Member Author

@elasticmachine merge upstream

Copy link
Contributor

@ppisljar ppisljar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also need a way to get from aggConfig to expression function. once we have expression builder ready:

const as = agg.toJSON();
createExpression().addFunction(as.type, as.params)

can we assume that serialized params are exact params to the expression function ?
with terms the problem is with orderAgg parameter, which in expression function should be a subexpression. if serialized params and expression function params are different we will need to extend agg type to return expressionparams :|

how do you feel about adding .toAST() function to aggConfig ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you tried to do it in a way like we do for aggTypes. But maybe we can create a factory method (or you can use some other pattern) to encapsulate forEach methods and make the setup hook easier for reading. It will be awesome if you will do it for aggTypes and aggFuncitons

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexwizp Just to make sure I follow; how would you envision this looking? Just removing the forEach from search_service entirely & passing expressions down so it is handled inside of the function instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I ask is because readability is what I was aiming for here, but of course I'm always open to ideas to make things more readable. Rather than having the magic of stuff getting registered in another function, I liked the idea of seeing very clearly what this service was registering during setup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a start I removed the bucket/metric differentiation per Peter's suggestion; lmk what your thoughts are and I'm happy to adjust further

Copy link
Contributor

@alexwizp alexwizp Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear enough why you created that interface in buckets/terms.ts not in buckets/terms_fn.ts
From my point of view we can simplify your approach and remove AggExpressionFunctionArgs and AggParamsMapping types at all. I don't see any benefits for today.

what about to do all buckets/terms_fn.ts:

....
import { AggExpressionType, AggConfigJson, IAggExpressionFunction } from '../';

const aggTypeName = 'terms';
const fnName = 'aggTypeTerms';

type Input = any;
type Output = AggExpressionType;

export interface AggParamsTerms extends IAggExpressionFunction {
  field: string;
  order: 'asc' | 'desc';
  orderBy: string;
  orderAgg?: AggConfigJson;
  size?: number;
  missingBucket?: boolean;
  missingBucketLabel?: string;
  otherBucket?: boolean;
  otherBucketLabel?: string;
  // advanced
  exclude?: string;
  include?: string;
}

type FunctionDefinition = ExpressionFunctionDefinition<
  typeof fnName,
  Input,
  AggParamsTerms,
  Output
>;

export const aggTypeTerms = (): FunctionDefinition => ({
....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the reason I put this with terms.ts & created the AggParamsMapping was because I felt that having typings for each of the agg type params would be generally useful outside of the expression functions & didn't have anything to do with expressions in particular.

For example, say I want to create a new agg config. Without the mapping, I could do:

const configState = {
  type: 'terms',
  enabled: true,
  params: {
    field: 'machine.os.keyword',
    // ...etc
  }
};
data.search.aggs.createAggConfigs(indexPattern, [configState]);

But configState is only enforced to match the general shape of params (currently pretty useless: Record<string, any>). What would be ideal would be to have the ability to pull in an interface for the params you are using so that you can have more type safety in your params. Perhaps by making the serialized agg config interface a generic:

const configState: AggConfigSerialized<AggParamsTerms> = {
  type: 'terms',
  enabled: true,
  params: {
    field: 'machine.os.keyword',
    // ...now this would be more strictly typed
  }
};

Or even better, using the type from the global AggParamsMapping based on the type name string.

More concretely, as we have an agg types registry which technically could have anything registered to it, having a mapping of these interfaces that others can add to via declare module will be necessary eventually anyway.

@lukeelmers
Copy link
Member Author

lukeelmers commented Apr 20, 2020

@ppisljar

we also need a way to get from aggConfig to expression function. once we have expression builder ready

Oops, totally forgot about this, thanks for the reminder.

how do you feel about adding .toAST() function to aggConfig ?

IMHO we should start with a separate little utility function that takes in the aggConfig & returns the expression AST. This would be easy to remove later when the expression builder is ready, at which point folks can just use toJSON() (or serialized()). Adding toAST to aggconfigs feels like mixing concerns a little more than we probably need to, but I'm not strongly opposed to it if you think it's the right path.

can we assume that serialized params are exact params to the expression function ?

That is my assumption. The expression args are serialized params (which differ on a per-agg basis), and enabled, id, + schema. As you pointed out, I think it'd be problematic/confusing if there was a divergence between expression args & params. This is why I'd like to eventually change how we are typing the agg params so that they are using the same interface as the expression fn is using for the args. (Edit: Also a param of type agg gets serialized by calling toJSON in the param type)

with terms the problem is with orderAgg parameter, which in expression function should be a subexpression.

I'm looking into this, but even if it were necessary i think it would be okay, since the subexpression could just be another agg type expression function with the serialized info for that agg.

@lukeelmers lukeelmers force-pushed the feat/agg-types-function branch from 37b7af7 to 09d97e5 Compare April 21, 2020 04:52
@lukeelmers
Copy link
Member Author

IMHO we should start with a separate little utility function that takes in the aggConfig & returns the expression AST.

Thinking about this some more, perhaps having a toExpressionAst() is the right approach here after all. I started writing a conversion function, only to realize that there's no elegant way to determine the name of the expression function from the AggConfig itself without making a bunch of assumptions about how those functions are named.

Since technically these could be registered from anywhere, it makes sense that the author of the agg type should indicate the name of the corresponding expression function.

So for now I'm thinking we do something similar to what @ppisljar was suggesting -- except we add an expressionName which each agg type explicitly defines. Then a toExpressionAst() method that will put together an AST based on that name. So in AggConfig this.type.expressionName is the name of the corresponding expression function.

We could even consider having the toExpressionAst in each individual agg type, though this would result in a lot of duplicated code so a top-level method on AggConfig is probably a simpler choice.

@lukeelmers lukeelmers force-pushed the feat/agg-types-function branch from 09d97e5 to a0ac52f Compare April 21, 2020 22:00
Copy link
Member Author

@lukeelmers lukeelmers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ppisljar @alexwizp Updated with toExpressionAst and added some notes for us to discuss.

Copy link
Member Author

@lukeelmers lukeelmers Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every agg type will now need to include this field so we know their corresponding function name (currently optional, but we can make it required once all functions have been created)

@lukeelmers lukeelmers force-pushed the feat/agg-types-function branch from 6296718 to 0588f1e Compare April 22, 2020 23:48
Copy link
Contributor

@ppisljar ppisljar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code LGTM, left two small comments. once we have AST builder ready this can be cleaned up (should probably the first case for using AST builder).

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@lukeelmers lukeelmers merged commit e2baff3 into elastic:master Apr 23, 2020
lukeelmers added a commit to lukeelmers/kibana that referenced this pull request Apr 23, 2020
…#63541)

* Minor cleanup on AggConfigJson interface.

* Add types for terms agg params & mapping.

* Add terms agg expression function.

* Register agg functions with expressions service.

* Add unit tests for terms expression function.

* Update API docs.

* AggConfigJson -> AggConfigSerialized

* Add serialize(), enforce serializable params, fix subexpressions in terms agg fn.

* Simplify getAggTypesFunctions

* it() -> test()

* Add help text.

* Ensure serialize() is used by agg param type.

* Add toExpressionAst to AggConfig.

* Add json arg to terms agg fn.

* Update docs.

* Fix typo which caused functional test failures.

* Add AggParam.toExpressionAst so AggConfig.toExpressionAst can return function ast instead of expression ast.

* Clean up overlooked items.
@lukeelmers lukeelmers deleted the feat/agg-types-function branch April 24, 2020 03:50
gmmorris added a commit to gmmorris/kibana that referenced this pull request Apr 24, 2020
* master: (70 commits)
  KQL removes leading zero and breaks query (elastic#62748)
  [FieldFormats] Cleanup: rename IFieldFormatType -> FieldFormatInstanceType (elastic#64193)
  [ML] Changes transforms wizard UI text (elastic#64150)
  [Alerting] change server log action type .log to .server-log in README (elastic#64124)
  [Metrics UI] Design Refresh: Inventory View, Episode 1 (elastic#64026)
  chore(NA): reduce siem bundle size using babel-plugin-transfor… (elastic#63269)
  chore(NA): use core-js instead of babel-polyfill on canvas sha… (elastic#63486)
  skip flaky suite (elastic#61173)
  skip flaky suite (elastic#62497)
  Renamed ilm policy for event log so it is not prefixed with dot (elastic#64262)
  [eslint] no_restricted_paths config cleanup (elastic#63741)
  Add Oil Rig Icon from @elastic/maki (elastic#64364)
  [Maps] Migrate Maps embeddables to NP (elastic#63976)
  [Ingest] Data streams list page (elastic#64134)
  chore(NA): add file-loader into jest moduleNameMapper (elastic#64330)
  [DOCS] Added images to automating report generation (elastic#64333)
  [SIEM][CASE] Api Integration Tests: Configuration (elastic#63948)
  Expose ability to check if API Keys are enabled (elastic#63454)
  [DOCS] Fixes formatting in alerting doc (elastic#64338)
  [data.search.aggs]: Create agg types function for terms agg. (elastic#63541)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) Feature:ExpressionLanguage Interpreter expression language (aka canvas pipeline) release_note:skip Skip the PR/issue when compiling release notes review v7.8.0 v8.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants