-
Notifications
You must be signed in to change notification settings - Fork 8.5k
[Metrics UI] Add integration tests for Metric Threshold Rule and refactor to fire correctly #109971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Metrics UI] Add integration tests for Metric Threshold Rule and refactor to fire correctly #109971
Conversation
…to fire correctly
|
Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui) |
x-pack/plugins/infra/server/lib/alerting/metric_threshold/lib/evaluate_alert.ts
Outdated
Show resolved
Hide resolved
💚 Build SucceededMetrics [docs]
History
To update your PR or re-run it, just comment with: |
jasonrhodes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for writing tests
|
Tested the branch and was able to get metric threshold alerts to trigger for both metric types. |
…ctor to fire correctly (elastic#109971) * [Metrics UI] Add integration tests for Metric Threshold and refactor to fire correctly * Removing unused variables * Fixing tests for metric_threshold_executor * Fixing test for metric_query * fixing test * Changing type guard
…ctor to fire correctly (elastic#109971) * [Metrics UI] Add integration tests for Metric Threshold and refactor to fire correctly * Removing unused variables * Fixing tests for metric_threshold_executor * Fixing test for metric_query * fixing test * Changing type guard
…ctor to fire correctly (#109971) (#111012) * [Metrics UI] Add integration tests for Metric Threshold and refactor to fire correctly * Removing unused variables * Fixing tests for metric_threshold_executor * Fixing test for metric_query * fixing test * Changing type guard Co-authored-by: Chris Cowan <chris@chriscowan.us>
…eporting-to-v2 * 'master' of github.com:elastic/kibana: (65 commits) Move to vis_types folder part 2 (elastic#110574) [SOR] use initialNamespaces when checking for conflict for `create` and `bulkCreate` (elastic#111023) [Discover] Remove export* syntax (elastic#110934) [Event log][7.x] Updated event log client to search across legacy IDs (elastic#109365) [Security Solution][Detection Rules] Changes 'activated' text on rule details page (elastic#111044) [Metrics UI] Filter out APM nodes from the inventory view (elastic#110300) [package testing] Update logging and pid configuration (elastic#111059) [Dashboard] Read App State from URL on Soft Refresh (elastic#109354) Add correct roles to test user for functional tests in dashboard (elastic#110880) [DOCS] Adds Lens Inspector and minor edits (elastic#109736) [DOCS] Updates Spaces page (elastic#111005) normalize initialNamespaces (elastic#110936) [Reporting] Clean up `any` usage, reorganize server route files (elastic#110740) [Security Solution] [CTI] Fixes bug that caused Threshold and Indicator Match rules to ignore custom rule filters if a saved query was used in the rule definition. (elastic#109253) skip flaky suites: elastic#111001, elastic#111022 [Security Solution][RAC] - Update reason field text (elastic#110308) [RAC][Security Solution] Make analyzer work with EuiDataGrid full screen (elastic#110913) [Metrics UI] Add integration tests for Metric Threshold Rule and refactor to fire correctly (elastic#109971) [DOCS] Updates Discover docs (elastic#110346) [RAC] Persistent timeline fields fix (elastic#110685) ...
|
Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync. |
…ctor to fire correctly (#109971) (#111011) * [Metrics UI] Add integration tests for Metric Threshold and refactor to fire correctly * Removing unused variables * Fixing tests for metric_threshold_executor * Fixing test for metric_query * fixing test * Changing type guard Co-authored-by: Chris Cowan <chris@chriscowan.us>
Summary
This PR adds integration tests along with a refactor to the Metric Threshold Alerts so that they fire as expected. Fixes #110912
Data and Expectations
For this PR I've added a new data set under
x-pack/test/functional/es_archives/infra/alerts_test_datathat contains 2 real world examples. The first is agaugestyle example where the data is indexed every 5 minutes for two different environments,devandprod. The values fordevare all set to0and the values forprodrange from0 to 3randomly. The very last event looks like this:{ "@timestamp": "2021-01-01T01:00:00Z", "value": 1, "env": "prod" }The expectation is that if the rule ran at
2021-01-01T01:00:00Zwith the following criteria:It should trigger a
shouldFireevent for theprodenvironment. Here is the same data in a chart with the threshold line.This failed on the current implementation because it was adding a
timeUnitto theendvalue of the time frame, then moving theendto the start of thetimeUnit. Also, thedate_rangebucket was being offset by60swhich also contributed the situation where it would miss the bucket.This PR removed the
endmodification for bothgaugeandcounterstyle queries. Forgaugestyle queries, thedate_rangebucketing has been removed. Thedate_rangebucketing was problematic because it uses the following logic@timestamp >= from && @timestamp < towhich excluded the last event (which is included in therangequery), which we expect to trigger the rule.The implementation in this PR now relies on the
rangein thequeryalong with a basic metric aggregation. Forgaugestyle rules, it will only query the same amount of data that is expressed withtimeUnitandtimeSize; for the example above, that would bethe last 5 minutes.For
counterstyle rules that usesAggregators.RATE, everything should work the same as before and only trigger on the last "full" bucket. The chart below is from the second data set and has been confirmed with an integration test.Checklist
Glossary
gaugestyle metrics are queries that use basic metric aggregations likesum,avg,max,min,percentileto aggregate values across multiple documents. What makes a metric agaugeis that the value is a snapshot in time. For examplesystem.cpu.total.pctis the total CPU usage being used at that point in time.counterstyle metrics are monotonically increasing numbers where the value is a cumulative sum over time. To find the value at a point in time you have to typically use derivative pipeline aggregation or rate (which isderivative(max(value))).