Conversation

@liuguoqingfz
Contributor

Description

Shard allocation is concurrent and order-dependent. Sometimes test2 and test3 fill up node capacity (the cap of 6 shards per node) before all three test1 primaries get a slot. When that happens, one test1 primary stays unassigned as well, and the test sees 16 (or even 14) assigned shards instead of the expected 17.
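
The starvation described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the greedy allocator, the class `AllocationOrderSketch`, and the node and shard counts are hypothetical and do not reproduce the real allocator or the test's actual configuration. It shows only that, with a fixed per-node cap, the number of test1 shards that find a slot depends on which index is allocated first.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not OpenSearch code): a greedy allocator with a per-node shard cap.
public class AllocationOrderSketch {

    /** Assigns each shard, in the given order, to the first node with spare capacity. */
    public static List<String> allocate(List<String> shardsInOrder, int nodes, int capPerNode) {
        int[] used = new int[nodes];
        List<String> assigned = new ArrayList<>();
        for (String shard : shardsInOrder) {
            for (int n = 0; n < nodes; n++) {
                if (used[n] < capPerNode) {
                    used[n]++;
                    assigned.add(shard);
                    break; // shard placed; move on to the next one
                }
            }
            // A shard that finds no free slot simply stays unassigned.
        }
        return assigned;
    }

    /** Counts assigned shards that belong to the given index. */
    public static long countFor(List<String> assigned, String index) {
        return assigned.stream().filter(s -> s.startsWith(index + "[")).count();
    }

    /** Builds shard names such as "test1[0]", "test1[1]", ... */
    public static List<String> shards(String index, int count) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add(index + "[" + i + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        int nodes = 2, cap = 2; // assumed values: 4 slots in total

        List<String> test1First = new ArrayList<>(shards("test1", 3));
        test1First.addAll(shards("test2", 3));

        List<String> test2First = new ArrayList<>(shards("test2", 3));
        test2First.addAll(shards("test1", 3));

        System.out.println(countFor(allocate(test1First, nodes, cap), "test1")); // prints 3
        System.out.println(countFor(allocate(test2First, nodes, cap), "test1")); // prints 1
    }
}
```

In this model, waiting until the test1 shards are placed before creating the other indices corresponds to the first ordering, which is the effect the fix aims for.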

Related Issues

Resolves #19726

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@liuguoqingfz liuguoqingfz requested a review from a team as a code owner October 24, 2025 14:01
@github-actions github-actions bot added >test-failure Test failure from CI, local build, etc. autocut flaky-test Random test failure that succeeds on second run labels Oct 24, 2025

Fixed a flaky test that is order dependent

Signed-off-by: Joe Liu <liuguoqingfz@gmail.com>

Signed-off-by: Joe Liu <guoqing4@illinois.edu>
@github-actions
Contributor

✅ Gradle check result for c964095: SUCCESS

@codecov

codecov bot commented Oct 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.17%. Comparing base (0c89456) to head (c964095).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19762      +/-   ##
============================================
- Coverage     73.19%   73.17%   -0.03%     
+ Complexity    70946    70924      -22     
============================================
  Files          5735     5735              
  Lines        324654   324654              
  Branches      46962    46962              
============================================
- Hits         237643   237556      -87     
- Misses        67875    67906      +31     
- Partials      19136    19192      +56     


Comment on lines +259 to +275
// Ensure test1 primaries are placed before adding other indices (prevents starvation)
assertBusy(() -> {
    ClusterState s = client().admin().cluster().prepareState().get().getState();
    int primariesStarted = 0, unassigned = 0;
    for (IndexRoutingTable irt : s.getRoutingTable()) {
        if (irt.getIndex().getName().equals("test1")) {
            for (IndexShardRoutingTable isrt : irt) {
                for (ShardRouting sr : isrt) {
                    if (sr.primary() && sr.started()) primariesStarted++;
                    if (sr.unassigned()) unassigned++;
                }
            }
        }
    }
    assertEquals(3, primariesStarted); // 3 primaries started
    assertEquals(3, unassigned); // 3 unassigned (the replicas)
});
Member

Can you replace this with a call to the ensureYellow("test1") helper method in the parent test class? The index should be red until all primaries are assigned, and will be yellow if replicas are unassigned.
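
The red/yellow distinction the reviewer relies on can be sketched as a small self-contained model. This is an illustration only: `HealthSketch` and `indexHealth` are hypothetical names, not part of the OpenSearch API, and the model ignores details such as relocating shards. It assumes the usual rule that an index is red while any primary is inactive, yellow when all primaries are active but some replica is not, and green otherwise.

```java
// Simplified model (hypothetical, not OpenSearch code) of index health colors.
public class HealthSketch {

    public enum Health { RED, YELLOW, GREEN }

    /**
     * Derives a health color from shard counts:
     * RED if any primary is inactive, YELLOW if only replicas are inactive, GREEN otherwise.
     */
    public static Health indexHealth(int primaries, int activePrimaries,
                                     int replicas, int activeReplicas) {
        if (activePrimaries < primaries) {
            return Health.RED;
        }
        if (activeReplicas < replicas) {
            return Health.YELLOW;
        }
        return Health.GREEN;
    }

    public static void main(String[] args) {
        // The state the hand-written loop asserts: 3 primaries started, 3 replicas unassigned.
        System.out.println(indexHealth(3, 3, 3, 0)); // prints YELLOW
        // One primary still waiting for a slot: the index is not yet yellow.
        System.out.println(indexHealth(3, 2, 3, 0)); // prints RED
    }
}
```

Under that rule, waiting for test1 to reach yellow implies that all three primaries are active, which is why a single `ensureYellow("test1")` call could stand in for the primary check in the loop.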

Labels

autocut flaky-test Random test failure that succeeds on second run >test-failure Test failure from CI, local build, etc.

Development

Successfully merging this pull request may close these issues.

[AUTOCUT] Gradle Check Flaky Test Report for ShardsLimitAllocationDeciderIT

2 participants