[Observability solution] [SLO] Run burn rate api tests in serverless & ess using mocha tagging #183113

Closed

Conversation

@mgiota mgiota commented May 10, 2024

Addresses #179549
Relates to #166755
🆕 PR #187924

Summary

This POC is the outcome of this R&D issue on deployment-agnostic tests for SLO and o11y alerting features. We based our work on the Security Solution's Mocha tagging approach and applied it to the SLO burn rate rule type. We plan to migrate all the existing API tests listed in the R&D issue in follow-up PRs.

The idea is that we write our API integration tests once, in a new location, observability_solution_api_integration, that is shared by the ESS and serverless configurations. Each configuration reads from its own location and loads the corresponding services. Each test is tagged with labels that determine the environments in which it runs.

describe('@ess @serverless SLO burn rate rule', () => {
  describe('Create rule', () => {
  });

  describe('@skipInServerless missing something', () => {
  });
});

Description

  • This PR follows the second option defined in this document, Mocha tagging. The following labels determine the environments in which the tests are executed:
    • @ess: Runs in an ESS environment (on-prem installation) as part of the CI validation on PRs.
    • @serverless: Runs in the first quality gate and in the periodic pipeline.
    • @skipInEss: Skipped for ESS environment.
    • @skipInServerless: Skipped for all quality gates and periodic pipeline.
  • It introduces a new folder x-pack/test/observability_solution_api_integration which will serve as a centralized location for all tests owned by the obs-ux-management team that must be run in Serverless and ESS environments. A list of all tests can be found in the R&D issue.
  • Within this folder, there is a "config" subdirectory that stores base configurations specific to both the Serverless and ESS environments. These configurations build upon the base configuration provided by test_serverless and api_integration, incorporating additional settings such as environment variables and tagging options.
  • The file x-pack/test/observability_solution_api_integration/test_suites/alerting/burn_rate/burn_rate_rule.ts is functional in both Serverless and ESS
  • It removes the existing burn rate rule from x-pack/test_serverless/api_integration/test_suites/observability/alerting/burn_rate/burn_rate_rule.ts
  • The alertingApi and sloApi services are imported from test_serverless

In the screenshot below you can see the test_suites folder structure after migrating the current alerting and slo features. We recommend having alerting and slo subfolders. The remaining observability apps could be added as further subfolders under test_suites. As part of this PR, the alerting > burn_rate subfolders are created.

Screenshot 2024-05-13 at 09 21 28

How to run locally

You can navigate into the new observability_solution_api_integration folder and use the following commands to run the tests in serverless and ESS environments, respectively. You can find more information in the README file of the observability_solution_api_integration folder.

cd x-pack/test/observability_solution_api_integration

# SERVERLESS
npm run alerting_burn_rate:server:serverless
npm run alerting_burn_rate:runner:serverless

# ESS
npm run alerting_burn_rate:server:ess
npm run alerting_burn_rate:runner:ess

CI

  • It includes a new entry in the ftr_configs.yml to execute the newly added tests in the pipeline.
  • It involves the addition of mochaOptions in both serverless/config.base.ts and ess/config.base.ts. In the case of serverless, it includes @serverless while excluding @skipInServerless. Similarly, for ess, it includes @ess and excludes @skipInEss.
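
The include/exclude behavior described in this list can be sketched with a single regular expression of the kind that could be handed to Mocha's `grep` option. This is a minimal sketch, not the PR's actual configuration: the helper `buildTagGrep` and the sample titles are hypothetical, and it assumes that the tags appear verbatim in the `describe` titles and that the pattern is matched against the full title (parent suite titles joined with the test title).

```typescript
// Hypothetical helper: builds one pattern that matches titles containing
// `include` while rejecting any title that also contains `exclude`.
function buildTagGrep(include: string, exclude: string): RegExp {
  // The negative lookahead rejects the title if `exclude` appears anywhere.
  return new RegExp(`^(?!.*${exclude}).*${include}.*`);
}

const serverlessGrep = buildTagGrep('@serverless', '@skipInServerless');
const essGrep = buildTagGrep('@ess', '@skipInEss');

// Illustrative full titles (suite titles joined with a made-up test title).
const titles: string[] = [
  '@ess @serverless SLO burn rate rule Create rule creates the rule',
  '@ess @serverless SLO burn rate rule @skipInServerless missing something works',
];

// Serverless keeps only the first title; ESS keeps both.
console.log(titles.filter((title) => serverlessGrep.test(title)));
console.log(titles.filter((title) => essGrep.test(title)));
```

Keeping the exclusion in a lookahead means a single pattern expresses both rules, so each environment's config only needs to differ in the two tag names it passes in.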

Quality Gates and periodic pipelines

The first quality gate is the execution of the tests as part of the PR check process. Tests are executed on a mocked serverless environment (not MKI). Failures do not block a release, but they do block PRs from being merged.

The serverless tests executed as part of the PR check use a stateless Elasticsearch. The periodic pipeline, which runs every 4 hours, is the health check of our tests in MKI environments.

TODO:

  • summarize the approach we use
  • move alerting api in a common location that can be reused by both environments (currently ess is broken, will be fixed once alerting api is in the common place)
  • remove burn rate rule from the old location (test_serverless)
  • think a bit more about the structure of the newly introduced observability_solution_api_integration folder and how it can be reused by the remaining observability apps

@apmmachine

🤖 GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@mgiota mgiota self-assigned this May 10, 2024
@mgiota mgiota force-pushed the observability_solution_api_integration branch from 265b687 to 0c04a91 Compare May 10, 2024 20:44
@mgiota mgiota force-pushed the observability_solution_api_integration branch 2 times, most recently from a0040d7 to 5f4ec82 Compare May 14, 2024 10:02
);

const svlSharedConfig = await readConfigFile(
require.resolve('../../../../test_serverless/shared/config.base.ts')
Contributor Author

I import shared services from test_serverless, since I want to reuse the sloApi and alertingApi

Member

This is a dangerous approach. Services from the test_serverless area are not guaranteed to work in stateful / ESS.

Contributor Author

Yep, I can understand this. In our case the exact same service works for both ESS and Serverless. What is the recommended approach here? Before doing this, I had the slo_api and alerting_api services within the new folder. Each configuration (ess & serverless) can then decide where to get the service from. Would this be a better approach?

I think we need to see which services can be reused in both environments and clean up/restructure them while we start migrating more tests into this kind of deployment agnostic solution. I would like to further discuss this with you. I'll arrange something

Contributor Author

@mgiota mgiota May 27, 2024

@pheyos As per our discussion I moved these 2 services under deployment agnostic services in this commit and tests still work fine! Does it look better this way?

@mgiota
Contributor Author

mgiota commented May 14, 2024

This PR makes it possible to write deployment agnostic tests using Mocha tagging. I tested it locally by running the scripts I added to the new package.json. I'm not sure whose area this is (appex-qa or kibana-operations), but my question is what else needs to be done to make these tests:

  • part of the CI pipeline. I added new configs in the .buildkite/ftr_configs.yml. Is it enough or do I need to make any more changes?
  • part of MKI

@mgiota mgiota marked this pull request as ready for review May 14, 2024 13:04
@dominiqueclarke dominiqueclarke self-requested a review May 14, 2024 18:47
Member

@pheyos pheyos left a comment

Left a few comments in the code.

While this approach can work short-term, there are some things to consider.

@ess: Runs in an ESS environment (on-prem installation) as part of the CI validation on PRs.
@serverless: Runs in the first quality gate and in the periodic pipeline.

With this PR you're going your own route when it comes to tagging and configuring serverless tests, which we do not support in the platform tooling. So while your new configs are included in regular local CI runs, you don't get ESS or MKI tests for free. You would have to set up your own pipeline, where you create ESS or MKI projects, point the test runner to it and perform the tests. If you want to run these tests in the Kibana serverless quality gates, you will have to meet some additional criteria for your pipeline before we can include it there.

Another important thing to consider is that teams have to move away from running tests with operator privileges in serverless (details in #183512) in the short term. With this current approach, how would you make sure to do that? As we discussed in zoom, authentication is quite different in stateful and serverless, which is the main reason why separate test directories have been introduced in the first place.

* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { FtrProviderContext } from '../../test_serverless/api_integration/ftr_provider_context';
Member

Using the serverless ftr_provider_context here can lead to issues with non-serverless tests (e.g. they might pick the wrong version of an overloaded test service).

Contributor Author

@mgiota mgiota May 17, 2024

I had a look at this file, where services are loaded from test_serverless, but I totally get the point. That was my concern as well and I was expecting a comment from you regarding this. Let's address this topic in our chat next week.

Contributor Author

@pheyos In this commit I moved my services to the deployment agnostic services. I don't recall what I need to do about the ftr_provider_context file. Could you assist?

@mgiota
Contributor Author

mgiota commented May 17, 2024

@elasticmachine merge upstream

@mgiota
Contributor Author

mgiota commented May 17, 2024

So while your new configs are included in regular local CI runs, you don't get ESS or MKI tests for free. You would have to set up your own pipeline

@pheyos Thanks a lot for taking the time to review this PR. You are absolutely right, this PR only runs the tests in a local simulated environment. This PR is only the first step. We are aware that extra things need to be set up so that tests can run in a real MKI environment. We would need guidance on how to set up our own pipeline. Do you have any documentation?

@MadameSheema After merging this initial PR, what were the next steps you did to make your tests run in an MKI environment? Do you have any link to a PR that we could take a look at?

As we discussed in zoom, authentication is quite different in stateful and serverless, which is the main reason why separate test directories have been introduced in the first place.
Regarding authentication, we plan to follow this approach. I am going to try SAML authentication on a POC for functional tests.

Let's further discuss a few things on a Zoom call next week.

@mgiota
Contributor Author

mgiota commented May 27, 2024

I'm putting here an EXCELLENT list of PRs @MadameSheema compiled for us! Thanks a ton! Such a great help! This is all the work the Security team has done to have tests executed in MKI projects and integrated in Buildkite. They have a periodic pipeline that executes all the tests that are tagged @serverless and not tagged @skipServerlessInMKI.

Here are a few more resources

@mgiota
Contributor Author

mgiota commented May 27, 2024

@dominiqueclarke Do you have an idea how we can fix this error Definition for rule '@kbn/eslint/require_mocha_tagging' was not found @kbn/eslint/require_mocha_tagging? It was introduced after this commit

@mgiota mgiota force-pushed the observability_solution_api_integration branch from 449297a to 4d21820 Compare May 27, 2024 12:21
@mgiota mgiota requested a review from pheyos May 27, 2024 13:07
@kibana-ci
Collaborator

kibana-ci commented Jun 3, 2024

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #1 / @kbn/eslint/require_mocha_tagging invalid describe('API Integration test', () => {})
  • [job] [logs] FTR Configs #82 / EPM Endpoints installing with hidden datastream should rollover hidden datastreams when failed to update mappings

Metrics [docs]

Canvas Sharable Runtime

The Canvas "shareable runtime" is a bundle produced to enable running Canvas workpads outside of Kibana. This bundle is included in third-party webpages that embed Canvas and therefore should be as slim as possible.

id before after diff
module count - 5412 +5412
total size - 8.8MB +8.8MB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @mgiota

@elasticmachine
Contributor

elasticmachine commented Jul 5, 2024

💔 Build Failed

  • Buildkite Build
  • Commit: 778fd0c
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-183113-778fd0c377a3

Failed CI Steps

cc @mgiota

@mgiota
Contributor Author

mgiota commented Jul 9, 2024

Here's an update regarding the deployment agnostic tests and this POC.

FTR tests already make use of the it.tags(['my-tag-1', 'my-tag-2']) pattern, and so we would like to stay with that pattern rather than introducing a new type of mocha tag in the it block's "description". The main trade-off for this is that this tagging is only available at the suite level ("describe") and not individual test level ("it"). However, we consider this not a blocker since tests within one suite are usually all good for the same environment and the few exceptions can easily be handled.

That being said, I am going to close this PR and open a new one using suiteTags. Here's the commit where I converted from mochaOps to suiteTags and verified that everything works fine.

Adding the following to the config files:

serverless config

suiteTags: {
  include: ['myIncludeServerlessTag'],
  exclude: ['myExcludeServerlessTag'],
},

ess config

suiteTags: {
  include: ['myIncludeEssTag'],
  exclude: ['myExcludeEssTag'],
},

and then adding for example this.tags(['myIncludeEssTag', 'myIncludeServerlessTag']) in the test suite instructs the test runner to run the same test suite in both environments.
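
The selection rule implied by these two configs can be sketched as a small predicate. This is only an illustration of the behavior described in this comment, not the test runner's source: the `isSuiteSelected` function and the `SuiteTagFilter` type are hypothetical, and the sketch assumes a suite runs when it carries at least one included tag and none of the excluded ones.

```typescript
// Shape of the suiteTags block shown in the config snippets.
interface SuiteTagFilter {
  include: string[];
  exclude: string[];
}

// Hypothetical rule: an excluded tag always wins; otherwise the suite needs
// at least one included tag (or the include list must be empty).
function isSuiteSelected(tags: string[], filter: SuiteTagFilter): boolean {
  const excluded = tags.some((tag) => filter.exclude.includes(tag));
  const included =
    filter.include.length === 0 || tags.some((tag) => filter.include.includes(tag));
  return included && !excluded;
}

const serverlessFilter: SuiteTagFilter = {
  include: ['myIncludeServerlessTag'],
  exclude: ['myExcludeServerlessTag'],
};
const essFilter: SuiteTagFilter = {
  include: ['myIncludeEssTag'],
  exclude: ['myExcludeEssTag'],
};

// A suite tagged for both environments, and one that opts out of serverless.
const bothEnvironments = ['myIncludeEssTag', 'myIncludeServerlessTag'];
const essOnly = ['myIncludeEssTag', 'myExcludeServerlessTag'];

console.log(isSuiteSelected(bothEnvironments, serverlessFilter)); // true
console.log(isSuiteSelected(bothEnvironments, essFilter)); // true
console.log(isSuiteSelected(essOnly, serverlessFilter)); // false
console.log(isSuiteSelected(essOnly, essFilter)); // true
```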

cc @jasonrhodes @pheyos

@mgiota
Contributor Author

mgiota commented Jul 9, 2024

@pheyos Before moving on with the new branch, whose history will contain only the necessary changes (suiteTags and nothing related to mochaOps), I wanted to check what the conflicts between this branch and main are. These conflicts are related to the operator privilege issue. We need to sync up on how to approach this.

@mgiota
Contributor Author

mgiota commented Jul 9, 2024

@pheyos Here's what I tried, and it looks like it works.

The idea is to define a single sloApi service in the test/api_integration folder (not in test_serverless). The service uses the config service to check whether it is running in a serverless environment and, if so, uses the svlUserManager service to create an API key for the admin role.
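
The branching described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the service from the PR: `ConfigLike`, `AuthPlan`, `planAuth` and `fakeConfig` are hypothetical names, and reading a boolean `serverless` key from the config service is an assumption made for the example.

```typescript
// Minimal stand-in for the config service; only `get` is modeled here.
interface ConfigLike {
  get(key: string): unknown;
}

// The two authentication paths the comment distinguishes.
type AuthPlan =
  | { kind: 'apiKey'; role: string } // serverless: API key for a role
  | { kind: 'predefinedUsers' }; // ESS: predefined roles and users

// Assumption: the config exposes a boolean under the key 'serverless'.
function planAuth(config: ConfigLike): AuthPlan {
  const isServerless = config.get('serverless') === true;
  return isServerless ? { kind: 'apiKey', role: 'admin' } : { kind: 'predefinedUsers' };
}

// Test double that serves values from a plain object.
const fakeConfig = (values: Record<string, unknown>): ConfigLike => ({
  get: (key) => values[key],
});

console.log(planAuth(fakeConfig({ serverless: true })).kind); // apiKey
console.log(planAuth(fakeConfig({ serverless: false })).kind); // predefinedUsers
```

Keeping the environment check inside one service means the test files themselves stay identical across both deployments.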

As you can see below, in the case of ESS it creates all the predefined roles and users, whereas on serverless it only creates an API key.

ess
Screenshot 2024-07-09 at 14 37 54

Screenshot 2024-07-09 at 14 41 37

serverless

Screenshot 2024-07-09 at 14 42 45

What do you think? Would this approach work for you?

@mgiota
Contributor Author

mgiota commented Jul 9, 2024

I am closing this POC, in favor of #187924, where I use suiteTags instead of mocha Tagging

@mgiota mgiota closed this Jul 9, 2024
mgiota added a commit that referenced this pull request Aug 21, 2024
Addresses #179549
Relates to #183113 

## Update
Since the Appex QA team has taken on deployment agnostic tests, a lot of
the original implementation of this PR has changed. Now that the Appex
QA team has provided a directory in which to write deployment agnostic
tests, the burn rate rule tests have been moved there.

To finish onboarding the burn rate rule test to this new framework, the
following was done.
1. Add an `oblt.stateful.config.ts` file to complement the existing
`oblt.serverless.config.ts` file to ensure the tests are run in CI
2. Ensure our test config is added to the buildkite pipeline
3. Add the alerting service to the new `deployment_agnostic/services`
directory.
4. Port the tests over to the new `deployment_agnostic` directory


To run serverless
```
node scripts/functional_tests_server --config x-pack/test/api_integration/deployment_agnostic/configs/serverless/oblt.serverless.config.ts
node scripts/functional_test_runner --config x-pack/test/api_integration/deployment_agnostic/configs/serverless/oblt.serverless.config.ts --grep="Burn rate rule"
```

To run stateful
```
node scripts/functional_tests_server --config x-pack/test/api_integration/deployment_agnostic/configs/stateful/oblt.stateful.config.ts
node scripts/functional_test_runner --config x-pack/test/api_integration/deployment_agnostic/configs/stateful/oblt.stateful.config.ts --grep="Burn rate rule"
```

For context, I've kept the history from the original PR description
below.

## 🍒 History

A new type of config file will be allowed for API integration and
functional tests within the `x-pack/test` folder, using a pattern of
`*.serverless.config.ts` — these config files will specify configuration
needed to run a set of tests in a serverless deployment context.

FTR tests already make use of the `it.tags(['my-tag-1', 'my-tag-2'])`
pattern, and so we would like to stay with that pattern rather than
introducing a new type of mocha tag in the it block's "description" as
it was introduced in this
[POC](#183113). The difference
with the previous PR in terms of tagging is that we use `suiteTags`
instead of `mochaOps`

Adding the following to the config files:

**serverless config**

```
suiteTags: {
  include: ['serverless'],
},
```

**ess config**

```
suiteTags: {
  include: ['ess'],
},
```

and then adding `this.tags(['serverless', 'ess'])` in the test suite
instructs the test runner to run the same test suite in both
environments.

In order to keep things simple, we stay with the current skip approach,
which means that flaky tests will be skipped for all environments by
appending .skip() to the suite or to specific test cases.

## Description

- This PR uses `suiteTags` for tagging the tests appropriately. The
following labels determine the environments in which the tests are
executed:
  - **@ess**: Runs in an ESS environment (on-prem installation) as part of
the CI validation on PRs.
  - **@serverless**: Runs in a serverless environment.
- It introduces a new folder
`x-pack/test/observability_solution_api_integration` which will serve as
a centralized location for all tests by obs-ux-management team that must
be run in Serverless and ESS environments. A list of all tests can be
found in the R&D
[issue](#179549)
- Within this folder, there is a "**config**" subdirectory that stores
base configurations specific to both the Serverless and ESS
environments. These configurations build upon the base configuration
provided by test_serverless and api_integration, incorporating
additional settings such as environment variables and tagging options.
- The file
`x-pack/test/observability_solution_api_integration/test_suites/alerting/burn_rate/burn_rate_rule.ts`
is functional in both Serverless and ESS
- It removes the existing burn rate rule from
`x-pack/test_serverless/api_integration/test_suites/observability/alerting/burn_rate/burn_rate_rule.ts`
- The `alertingApi` and `sloApi` services are moved to
`test/api_integration` services

In the screenshot below you can see the `test_suites` folder structure
after migrating the current slo burn rate rule. We recommend having
`alerting` and `slo` subfolders. The remaining observability apps could
be added as further subfolders under test_suites. As part of this PR, the
`alerting > burn_rate` subfolders are created.

<img width="376" alt="Screenshot 2024-05-13 at 09 21 28"
src="https://github.com/elastic/kibana/assets/2852703/3ccaf0a5-1443-4bad-ad06-daa347488bf1">

## How to run locally
You can navigate into the new `observability_solution_api_integration`
folder and use the following commands to run the tests in serverless and
ESS environments, respectively. You can find more information in the
README file of the observability_solution_api_integration folder.

```
cd x-pack/test/observability_solution_api_integration

# SERVERLESS
npm run alerting_burn_rate:server:serverless
npm run alerting_burn_rate:runner:serverless

# ESS
npm run alerting_burn_rate:server:ess
npm run alerting_burn_rate:runner:ess
```

## CI

- It includes a new entry in the `ftr_configs.yml` to execute the newly
added tests in the pipeline.
- It involves the addition of `suiteTags` in both
serverless/config.base.ts and ess/config.base.ts. In the case of
serverless, it includes **@serverless** while excluding
**@skipInServerless**. Similarly, for ess, it includes **@ess** and
excludes **@skipInEss**.

## Quality Gates and MKI pipeline
The Platform team will support config files within `x-pack/test` folder
with a pattern of `*.serverless.config.ts`, so these tests will be
included in Kibana's Quality gates and will be run against a real MKI
environment.

---------

Co-authored-by: Dominique Belcher <dominique.clarke@elastic.co>
Co-authored-by: Dominique Clarke <doclarke71@gmail.com>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
Co-authored-by: Robert Oskamp <traeluki@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Labels
ci:project-deploy-observability Create an Observability project release_note:skip Skip the PR/issue when compiling release notes v8.15.0

9 participants