Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Marks local-cluster tests with #[serial] #3466

Merged
merged 1 commit into from
Nov 4, 2024

Conversation

brooksprumo
Copy link

Problem

local-cluster tests timeout a lot; one is test_wait_for_max_stake(). All the local-cluster tests are marked with #[serial] to ensure they don't interfere with each other, but test_wait_for_max_stake() is not. We also see test failures in test_boot_from_local_state, which are due to timeouts too. Since #[serial] is opt-in, if we do not add the annotation, the test can run concurrently with other tests.

Summary of Changes

Mark all the local-cluster tests with #[serial].

@steviez
Copy link

steviez commented Nov 4, 2024

@bw-solana - FYI, I know you've spent some time looking at test_wait_for_max_stake as well.
@brooksprumo - Some PR's where we've been looking at this test: #3295, #3430

@brooksprumo
Copy link
Author

brooksprumo commented Nov 4, 2024

@steviez - Thanks for the links! Yeah, originally I was looking at this because my test, test_boot_from_local_state, times out sometimes too. I have code that bounds the timeouts to 30 seconds, so we don't wait forever, but it is still something that should never happen (we should only be waiting for about 10 slots, which should happen much faster than 30 seconds). I noticed that we don't have #[serial] on all the local-cluster tests, so there's possibility for issues. Low hanging fruit to ensure tests are not stepping on each other.

@brooksprumo brooksprumo marked this pull request as ready for review November 4, 2024 20:18
Copy link

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Can monitor CI time to see if we need more shards

Copy link

@steviez steviez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better reliability / reduced flakiness would be a win and can always look at how to speed things back up after the fact

@brooksprumo brooksprumo merged commit 2c3d436 into anza-xyz:master Nov 4, 2024
40 checks passed
@brooksprumo brooksprumo deleted the serial-local-cluster branch November 4, 2024 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants