e2e error: already exists in namespace '' and cannot be managed by operator-controller #1307

Open
Tracked by #950
joelanford opened this issue Sep 24, 2024 · 6 comments · May be fixed by #1428
@joelanford
Member

See https://github.com/operator-framework/operator-controller/actions/runs/11019902090/job/30603456402#step:4:417

Other runs with this error:

=== RUN   TestClusterExtensionInstallReResolvesWhenManagedContentChanged
    cluster_extension_install_test.go:746: When a cluster extension is installed from a catalog
    cluster_extension_install_test.go:747: It resolves again when managed content is changed
    cluster_extension_install_test.go:769: It installs the specified package with correct bundle path
    cluster_extension_install_test.go:770: By creating the ClusterExtension resource
    cluster_extension_install_test.go:773: By reporting a successful installation
    cluster_extension_install_test.go:774: 
        	Error Trace:	/home/runner/work/operator-controller/operator-controller/test/e2e/cluster_extension_install_test.go:780
        	            				/opt/hostedtoolcache/go/1.22.5/x64/src/runtime/asm_amd64.s:1695
        	Error:      	Not equal: 
        	            	expected: "True"
        	            	actual  : "False"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,2 +1,2 @@
        	            	-(v1.ConditionStatus) (len=4) "True"
        	            	+(v1.ConditionStatus) (len=5) "False"
        	            	 
    cluster_extension_install_test.go:774: 
        	Error Trace:	/home/runner/work/operator-controller/operator-controller/test/e2e/cluster_extension_install_test.go:781
        	            				/opt/hostedtoolcache/go/1.22.5/x64/src/runtime/asm_amd64.s:1695
        	Error:      	Not equal: 
        	            	expected: "Succeeded"
        	            	actual  : "Failed"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-Succeeded
        	            	+Failed
    cluster_extension_install_test.go:774: 
        	Error Trace:	/home/runner/work/operator-controller/operator-controller/test/e2e/cluster_extension_install_test.go:782
        	            				/opt/hostedtoolcache/go/1.22.5/x64/src/runtime/asm_amd64.s:1695
        	Error:      	"CustomResourceDefinition 'probes.monitoring.coreos.com' already exists in namespace '' and cannot be managed by operator-controller" does not contain "Installed bundle"
    cluster_extension_install_test.go:774: 
        	Error Trace:	/home/runner/work/operator-controller/operator-controller/test/e2e/cluster_extension_install_test.go:774
        	Error:      	Condition never satisfied
        	Test:       	TestClusterExtensionInstallReResolvesWhenManagedContentChanged
--- FAIL: TestClusterExtensionInstallReResolvesWhenManagedContentChanged (63.29s)
@tmshort
Contributor

tmshort commented Sep 24, 2024

This is inconsistent: when it shows up, a re-run of the e2e usually succeeds. This is a fairly frequent flake.

@joelanford
Member Author

Sounds like the reason for this flake is that we reuse bundle objects in different tests and do not properly wait for cleanup to occur.

Short term fix:

  • Wait for bundle objects to be fully deleted in the test cleanup

Long term fix:

  • Automatically generate random test fixtures and run tests in parallel. This setup resolves the conflict problem and helps exercise the controller under more typical load where multiple ClusterExtensions and ClusterCatalogs may exist at once.

@LalatenduMohanty LalatenduMohanty self-assigned this Oct 1, 2024
@LalatenduMohanty LalatenduMohanty removed their assignment Oct 8, 2024
@rashmi43
Contributor

rashmi43 commented Oct 8, 2024

Hi @joelanford, for the short-term fix, should I add a grace period?

@rashmi43
Contributor

rashmi43 commented Oct 9, 2024

/assign rashmi43

@m1kola
Member

m1kola commented Nov 6, 2024

I think this is related to #1354.

@tmshort
Contributor

tmshort commented Nov 6, 2024

#1354 seems to be WIP. I'm taking a look at this as well.

@tmshort tmshort self-assigned this Nov 6, 2024
tmshort added a commit to tmshort/operator-controller that referenced this issue Nov 6, 2024
Fixes operator-framework#1307

Create and use a new namespace for every e2e test. This means that
extension resources are placed in their own namespace. Each test
deletes the namespace, and then waits until completion. This ensures
that _most_ of an extension's resources are deleted. It does not
guarantee that global resources (e.g. CRDs, CRs, CRBs) are deleted,
but it improves the tests, and would eventually allow the tests
to be run in parallel (assuming the installed extensions allow
for that).

Signed-off-by: Todd Short <tshort@redhat.com>
@tmshort tmshort linked a pull request Nov 6, 2024 that will close this issue
4 tasks
tmshort added a commit to tmshort/operator-controller that referenced this issue Nov 6, 2024
Fixes operator-framework#1307

Create and use a new namespace for every e2e test. This means that
extension resources are placed in their own namespace. Each test
deletes the namespace, and then waits until completion. This ensures
that _most_ of an extension's resources are deleted.

It does guarantee that global resources (e.g. CRDs, CRs, CRBs) are
deleted.

And would eventually allow the tests to be run in parallel (assuming
the installed extensions allow for that).

Signed-off-by: Todd Short <tshort@redhat.com>