chore(ci): Improve reliability of retries in TracingE2ET #2018
I added another retry loop to a flaky MetricsE2ET which seems to pass reliably now. I will run some more E2E tests sequentially now. The issue was that the

    List<Double> orderMetrics = RetryUtils.withRetry(() -> {
        List<Double> metrics = metricsFetcher.fetchMetrics(invocationResult.getStart(), invocationResult.getEnd(),
                60, NAMESPACE, "orders", Collections.singletonMap("Environment", "test"));
        if (metrics.get(0) != 2.0) {
            throw new DataNotReadyException("Expected 2.0 orders but got " + metrics.get(0));
        }
        return metrics;
    }, "orderMetricsRetry", DataNotReadyException.class).get();
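For readers unfamiliar with the pattern, here is a minimal, dependency-free sketch of how a helper shaped like `withRetry` could behave: it re-runs the action while it throws the given exception type and passes anything else through. This is not the repository's `RetryUtils`; the class name `RetrySketch`, the attempt limit, the wait time, and the simulated fetcher are all assumptions made for illustration, and `DataNotReadyException` is redefined here only as a stand-in.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for the exception type used in the test above.
class DataNotReadyException extends RuntimeException {
    DataNotReadyException(String message) {
        super(message);
    }
}

class RetrySketch {
    // Assumed limits; the values used by the real E2E tests are not shown in this PR.
    static final int MAX_ATTEMPTS = 5;
    static final long WAIT_MILLIS = 100;

    // Wraps `action` so that it is re-run while it throws `retryOn`.
    // Any other exception, or the failure of the last attempt, reaches the caller.
    static <T> Supplier<T> withRetry(Supplier<T> action, String name,
                                     Class<? extends RuntimeException> retryOn) {
        return () -> {
            for (int attempt = 1; ; attempt++) {
                try {
                    return action.get();
                } catch (RuntimeException e) {
                    if (!retryOn.isInstance(e) || attempt == MAX_ATTEMPTS) {
                        throw e;
                    }
                    System.out.println(name + ": attempt " + attempt + " failed: " + e.getMessage());
                    sleep(WAIT_MILLIS);
                }
            }
        };
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated fetcher: the metric only becomes available on the third call.
        List<Double> metrics = withRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new DataNotReadyException("metric not available yet");
            }
            List<Double> fetched = List.of(2.0);
            return fetched;
        }, "orderMetricsRetry", DataNotReadyException.class).get();
        System.out.println(metrics + " after " + calls.get() + " calls");
    }
}
```

Throwing a dedicated exception when the data is present but not yet the expected value is what lets the assertion itself drive the retry, rather than only retrying on fetch errors.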
The last 6 consecutive runs of E2E tests succeeded. It looks like we have resolved all flaky tests with appropriate retry logic for now.
dreamorosi
left a comment
Great work with these tests!
Thanks. This is the 3rd time now that I think they are fixed. Let's see if the tests prove me wrong in the next couple of weeks 😁
Summary
This PR addresses two reliability issues in the TracingE2E tests, which were failing occasionally:
TracingE2Etest to avoid occasional timeouts #1846 (comment)
Also adds some more debug logs and more useful logging statements to make debugging easier in the future.
Changes
Issue number: #1846
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.