-
Notifications
You must be signed in to change notification settings - Fork 565
Fix Condition check for Reconcile InstallPlan status #3153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
test/e2e/subscription_e2e_test.go
Outdated
// The Reason check: | ||
// CatalogSources[Added|Deleted] could overwrite AllCatalogSourcesHealthy, but leave Status/Message preserved | ||
// Added/Deleted could subesequently be overwritten by Healthy | ||
// This all depends on when reconciliation occurs and if the number of catalog sources changed between reconcilliations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This all depends on when reconciliation occurs and if the number of catalog sources changed between reconcilliations | |
// This all depends on when reconciliation occurs and if the number of catalog sources changed between reconciliations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh... no spell checker in my code editor!
if lastCond.Status != cond.Status && lastCond.Reason != cond.Reason && lastCond.Message == cond.Message { | ||
GinkgoT().Logf("waited %s subscription condition met %v\n", time.Since(lastTime), cond) | ||
lastTime = time.Now() | ||
lastCond = cond | ||
} | ||
return true | ||
} | ||
|
||
fmt.Printf("subscription condition not met: %v\n", cond) | ||
if lastCond.Status != cond.Status && lastCond.Reason != cond.Reason && lastCond.Message == cond.Message { | ||
GinkgoT().Logf("waited %s subscription condition not met: %v\n", time.Since(lastTime), cond) | ||
lastTime = time.Now() | ||
lastCond = cond | ||
} | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love if there were a way to write this for greater readability.
Maybe a comment something like
// if status/reason/message meet expectations, then subscription state is considered met/true IFF this is the result of a recent change of status/reason
// else, cache the current status/reason for next loop/comparison
I also have a concern that this makes this test edge-set instead of level-set. I'm not sure how purist we need to be for an e2e test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the general trend to reduce repetitive logging, and I admit, the comparisons are ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BUT... that's not the reason for the change... The change reduces the logging when conditions don't change, the reason for getting out of the loop doesn't change, just the output.
test/e2e/subscription_e2e_test.go
Outdated
require.FailNowf(GinkgoT(), "failed", "subscription contains unexpected installplan condition: %v", cond) | ||
case operatorsv1alpha1.SubscriptionInstallPlanMissing: | ||
require.Equal(GinkgoT(), operatorsv1alpha1.ReferencedInstallPlanNotFound, cond.Reason) | ||
case operatorsv1alpha1.SubscriptionCatalogSourcesUnhealthy: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this test is written makes it sensitive to any change in the cluster state or the reconciliation flow of the controller. Since we're trying to test a negative ("all of the non-InstallPlan
conditions can change between the time we looked at them in the beginning of the test and now"), I don't know that this test is super powerful, and, since that kind of approach is very fragile, I wonder if a different approach exists (maybe just to delete the test? what is it testing?).
Consider the case where a CatalogSource happens to blip unhealthy for a moment during this check. Why should the test fail if the reconciler has correctly exposed that downtime in status?
test/e2e/subscription_e2e_test.go
Outdated
|
||
By(`Ensure original non-InstallPlan status conditions remain after InstallPlan transitions`) | ||
hashEqual := comparison.NewHashEqualitor() | ||
By(`Ensure original, specific, non-InstallPlan status conditions remain after InstallPlan transitions`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By(`Ensure original, specific, non-InstallPlan status conditions remain after InstallPlan transitions`) | |
By(`Ensure InstallPlan-related status conditions match what we're expecting`) |
Fix operator-framework#3151 Remove non-InstallPlan related checks for this test. Also: * Clean up some looping log messages * Clean up some logging added when comments were converted These comments/logs are at the beginning of the test, and are also part of the test sequence, so they are redundant (and possibly confusing) Signed-off-by: Todd Short <todd.short@me.com>
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: stevekuznetsov, tmshort The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The test failed in that e2e run @tmshort
|
That should be including the (new) finalizer code in the merge. AFAICT, it's failing on the delete of the CSV (not on the removal of the CRBs/CRs/SA), as in the original failure. The merge queue was also failing on unit test, so something was wonky. |
Fix #3151
The Reason check for CatalogSourcesUnhealthy does not take into account that CatalogSourcesAdded may overwrite the AllCatalogSourcesHealthy reason because the number of catalog sources changed. So, depending on when reconcilliation happens relative to the catalog sources being added, the initial Reason may be either Added or Healthy.
At the end of the test, also depending on when reconcilliation happens, the new Reason may be Added or Healthy.
New can't be Added after the original is Healthy because no additional catalog sources are added once the test starts.
However, the Status and Message (which is the real indicator of health) must match. And because of the possible update, don't check the timestamps.
Also:
These comments/logs are at the beginning of the test, and are also part of the test sequence, so they are redundant (and possibly confusing)
Description of the change:
Motivation for the change:
Architectural changes:
Testing remarks:
Reviewer Checklist
/doc
[FLAKE]
are truly flaky and have an issue