Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add optional timeouts to waiting for counters/gauges #12251

Merged
merged 14 commits into from
Jul 30, 2020

Conversation

samflattery
Copy link
Contributor

Signed-off-by: Sam Flattery samflattery@google.com

Commit Message: Add timeouts to waiting for counters/gauges
Additional Description:

  • change waitForCounterEq/Ge to use the same functions as waitForGaugeEq/Ge
  • add a default timeout of 10s
  • raise an exception if they timeout

Risk Level: Low
Testing: passes xDS fuzzer tests which make heavy use of these
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Sam Flattery <samflattery@google.com>
@samflattery
Copy link
Contributor Author

/cc @asraa @htuch

Copy link
Contributor

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think moving to an AssertionResult will end up better than throwing an exception just for test clarity.

}

void waitForCounterGe(const std::string& name, uint64_t value) override {
notifyingStatsAllocator().waitForCounterFromStringGe(name, value);
TestUtility::waitForCounterGe(statStore(), name, value, time_system_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhere here we should warn about the default timeout in the comments, because it's opaque that it might throw an exception.

I think it might make more sense to return an AssertionResult instead of throwing, but calls to these waitFor's will need to be wrapped with ASSERT_TRUE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Might be better to split this up into two functions, say waitForCounterGe and waitForCounterGeTimeout since there are dozens of calls to these across all the different tests. The fuzz test is probably the only one that would expect these functions to timeout, say if there is a bug in xDS and the update doesn't properly go through, since mostly they're used to wait for the server to catch up to updates in integration tests. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that without ABSL_MUST_USE_RESULT you can ignore the AssertionResult.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right sure, I thought you meant any calls to waitFor would need to be wrapped with ASSERT_TRUE, but I can just do it in the fuzzer. I'll make that change, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit: I just moved the assert true to the server stub instead so that all calls will use it

test/test_common/utility.h Show resolved Hide resolved
@asraa asraa self-assigned this Jul 23, 2020
Sam Flattery added 2 commits July 23, 2020 17:19
Signed-off-by: Sam Flattery <samflattery@google.com>
Signed-off-by: Sam Flattery <samflattery@google.com>
test/test_common/utility.cc Outdated Show resolved Hide resolved

/**
* Wait for a gauge to >= a given value.
* @param store supplies the stats store.
* @param name gauge name.
* @param value target value.
* @param time_system the time system to use for waiting.
* @param timeout the maximum time to wait before timing out.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to prevent this from using real time? For tests I think we can only safely do this with simulated time. Tests already have a bounded lifetime, so we can rely on that for the real time counter wait timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason for doing this was that the real timeout is something like 5 minutes so debugging is kind of difficult when you have to wait so long to see the logs, etc. I can check to see if I can just condition on whether time_system_ is simulated time or not, but not sure how useful a timeout in simulated time would be. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per offline discussion sam's changing the timeout to an optional, so users will fail on timeout only when a timeout param is passed

Signed-off-by: Sam Flattery <samflattery@google.com>
Sam Flattery added 2 commits July 28, 2020 11:50
Signed-off-by: Sam Flattery <samflattery@google.com>
Signed-off-by: Sam Flattery <samflattery@google.com>
@samflattery
Copy link
Contributor Author

samflattery commented Jul 28, 2020

@asraa I made two different implementations (see last two commits), one in which the user has to pass in the timeout every time, and one in which it is a member variable that can be set in the HttpIntegrationTest before the server is started. The second approach is a bit more complex though and involves passing the timeout through several functions before setting it so it's a bit more disruptive to the existing codebase. Happy to go with either, let me know what you think!

@samflattery
Copy link
Contributor Author

Failing extension discovery test which Asra said was flaking

Sam Flattery added 3 commits July 28, 2020 14:54
Signed-off-by: Sam Flattery <samflattery@google.com>
Signed-off-by: Sam Flattery <samflattery@google.com>
test/integration/server_stats.h Show resolved Hide resolved
test/test_common/utility.h Outdated Show resolved Hide resolved
Signed-off-by: Sam Flattery <samflattery@google.com>
asraa
asraa previously approved these changes Jul 28, 2020
Copy link
Contributor

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@asraa asraa changed the title test: add timeouts to waiting for counters/gauges test: add optional timeouts to waiting for counters/gauges Jul 28, 2020
@samflattery
Copy link
Contributor Author

@asraa Still failing extension discovery after merging master, passes locally on --runs_per_test=100

@asraa
Copy link
Contributor

asraa commented Jul 28, 2020

#12326
I'll re-run

Sam Flattery added 2 commits July 29, 2020 09:56
Signed-off-by: Sam Flattery <samflattery@google.com>
Signed-off-by: Sam Flattery <samflattery@google.com>
@samflattery
Copy link
Contributor Author

Failing flaking extension discovery service test

@asraa
Copy link
Contributor

asraa commented Jul 29, 2020

Needs a merge :(

asraa
asraa previously approved these changes Jul 29, 2020
Copy link
Contributor

@asraa asraa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@asraa asraa merged commit 702f1fb into envoyproxy:master Jul 30, 2020
@samflattery samflattery deleted the wait_timeout branch July 31, 2020 09:51
chaoqin-li1123 pushed a commit to chaoqin-li1123/envoy that referenced this pull request Aug 7, 2020
…y#12251)

Commit Message: Add timeouts to waiting for counters/gauges in integration tests
add a default timeout of 0s which will not enable the timeout for the existing tests
raise an assertionfailure if they timeout

Signed-off-by: Sam Flattery <samflattery@google.com>
chaoqin-li1123 pushed a commit to chaoqin-li1123/envoy that referenced this pull request Aug 7, 2020
…y#12251)

Commit Message: Add timeouts to waiting for counters/gauges in integration tests
add a default timeout of 0s which will not enable the timeout for the existing tests
raise an assertionfailure if they timeout

Signed-off-by: Sam Flattery <samflattery@google.com>
Signed-off-by: chaoqinli <chaoqinli@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants