Skip to content

e2e: Increase all ANR timeouts to 2m to ensure CI reliability. #1733

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 19, 2023

Conversation

marun
Copy link
Contributor

@marun marun commented Jul 19, 2023

Why this should be merged

CI seems to be exceeding many ANR-related timeouts. Rather than bumping timeouts piecemeal, all ANR timeouts are set to the same constant of 2 minutes.

How this works

How this was tested

@marun marun requested review from abi87 and gyuho as code owners July 19, 2023 05:47
Copy link
Contributor

@abi87 abi87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this!

Comment on lines -170 to -171
// start is async, so wait some time for cluster health
time.Sleep(time.Minute)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this sleep just never needed? Like a minute long sleep is long

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it was needed in the past, but it doesn't seem to be needed now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not neede anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to be clear, this sleep ensured each e2e job was wasting 30-40s. At least with a timeout the check can complete as soon as the nodes are ready, but this sleep will never exit early if the nodes are healthy earlier than expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing piece here is understanding that Health actually blocks until Healthy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup that was very old ANR tech dep. with the first grpc server implementation, start (with blockchain creation -I know this is not the case but this is a copy probably-) broked posterior health call without sleeping some time.

@StephenButtolph StephenButtolph merged commit 79e59d2 into dev Jul 19, 2023
@StephenButtolph StephenButtolph deleted the e2e-increase-timeouts branch July 19, 2023 17:36
@StephenButtolph StephenButtolph added this to the v1.10.5 milestone Jul 19, 2023
@StephenButtolph StephenButtolph added ci This focuses on changes to the CI process testing This primarily focuses on testing labels Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci This focuses on changes to the CI process testing This primarily focuses on testing
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants