Open
Description
In #5603 the bot is trying to upgrade a dependency of the example. We don't actually know whether that upgrade is safe; it may also require argument changes.
The main issue is that the docker-compose/monitor/docker-compose.yml file is not being tested in the CI. SPM is a major piece of functionality, so it would be good to have a basic smoke test that it works, as well as a test that the corresponding docker compose file is valid. An example of such an e2e test is scripts/build-all-in-one-image.sh, where we validate that the resulting image is correctly serving the web UI.
Proposal for integration test:
- perform a fresh build of Jaeger code (don't load published images), to test the current branch
- bring up services via docker compose
- wait for all services to respond successfully to health checks
- for each service in microsim
  - loop until at least 3 non-zero data points are retrieved from the REST API
  - or until the overall timeout (say 10min) is reached, in which case fail the test
- we need to test that the response also contains the right labels, specifically to catch this issue: Restore "operation" name in the metrics response #5673
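The polling part of the proposal could look roughly like the sketch below. It is only an illustration: `fetch` is a hypothetical callable that queries the metrics REST API for one service and returns the parsed JSON, and the response shape (`metrics` -> `labels` / `metricPoints` -> `gaugeValue.doubleValue`) and the `operation` label name are assumptions that would need to be checked against the real API output.

```python
import time
from typing import Callable, Iterable


def count_nonzero_points(response: dict, required_label: str = "operation") -> int:
    """Count non-zero data points in series that carry `required_label`.

    ASSUMPTION: the response looks like
    {"metrics": [{"labels": [{"name": ..., "value": ...}],
                  "metricPoints": [{"gaugeValue": {"doubleValue": ...}}]}]}
    """
    total = 0
    for series in response.get("metrics", []):
        label_names = {label.get("name") for label in series.get("labels", [])}
        if required_label not in label_names:
            continue  # series without the expected label do not count
        for point in series.get("metricPoints", []):
            value = point.get("gaugeValue", {}).get("doubleValue", 0)
            # Skip zeros, NaN (NaN != NaN) and non-numeric placeholders.
            if isinstance(value, (int, float)) and value == value and value != 0:
                total += 1
    return total


def wait_for_metrics(
    services: Iterable[str],
    fetch: Callable[[str], dict],  # hypothetical: service name -> parsed JSON
    min_points: int = 3,
    timeout_s: float = 600,        # overall timeout, "say 10min"
    poll_interval_s: float = 10,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until every service has `min_points` non-zero points, else time out."""
    deadline = clock() + timeout_s
    pending = list(services)
    while pending:
        # Keep only the services that still lack enough data points.
        pending = [s for s in pending if count_nonzero_points(fetch(s)) < min_points]
        if not pending:
            return
        if clock() >= deadline:
            raise TimeoutError(f"no metrics for services: {pending}")
        sleep(poll_interval_s)
```

Passing `fetch`, `clock` and `sleep` in as parameters keeps the loop itself testable without docker compose; the real test would supply a `fetch` that calls the query service.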
Additional improvements:
- The prometheus config is set to scrape metrics every 15sec, but in the UI the data points come with a granularity of 1min. That means the test will have to run for at least 3min before succeeding, which is an artificial delay. We should find a way to use a shorter interval (it may already be supported by the query API)
- Check not just the `calls` metric, but also `errors`, at least for services that may have errors
  - This might require improvement to microsim to generate errors: Add ability to simulate errors yurishkuro/microsim#16
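To make the granularity point above concrete, here is a back-of-the-envelope sketch of the minimum test duration. `min_wait_seconds` is a hypothetical helper, and it assumes one data point per granularity interval and ignores startup time and any rate window, so the results are lower bounds only.

```python
def min_wait_seconds(points_needed: int, granularity_s: float) -> float:
    """Lower bound on the wait, assuming one data point per interval."""
    return points_needed * granularity_s


# 3 points at the UI's 1min granularity vs. at the 15sec scrape interval.
print(min_wait_seconds(3, 60))  # 180
print(min_wait_seconds(3, 15))  # 45
```

If the query API does allow a shorter interval, the artificial delay would shrink from about 3min to under 1min.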