increase watcher stop test max to 60s #45679
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
As of #43939 Watcher tests now correctly block until all Watch executions
kicked off by that test are finished. Prior we allowed tests to finish with
outstanding watch executions. It was known that this would increase the
time needed to finish a test. However, running the tests on CI can be slow
and on at least 1 occasion it took 60s to actually finish.
This PR simply increases the max allowable timeout for Watcher tests
to clean up after themselves.
Example of a log that illustrates that the time out needs to be increase:
Tests (with 30s max):
Server
Since these are run on the same machine, it is OK to compare timestamps with precision.
The test logs above says:
The test started at
23:53,392
and finished at24:26,457
.Since the timeout in this example is 30s, and the time the test took is 30+s we can start to assume a timeout.
"waiting until stopping state reached stopped state" -> this says that we told Watcher to shut down, but it never fully reached a shutdown state in the allowed time (30s in this case)
"There are still tasks running after this test" -> is an another indication that test timed out with outstanding work to be done.
Looking at the server logs:
We start to look just after
23:53,392
, and see that the Watch itself executed VERY fast23:53,489
->23:53,493
So the test framework executed the Watch and started the shutdown process at
23:53,524
(which now ensure all outstanding work is done before finishing the shutdown)We can also see that the Watcher did not fully stop until
24:53,018
, so this test on this machine at this moment in time, took just shy of 60s from starting to shut down till it actually shuts down (on my local machine it takes milliseconds)"watch history could not be written" and "triggered watches could not be deleted" are artifacts of the test framework cleaning up, but the lingering task continues to live in the back ground. With #43939 this should only happen if the stop timeout is exceeded.
"finished [test_watch]" -> is an indication the next test running before the prior test has actually finished. Again with #43939 this should only happen if the stop timeout is exceeded.