flake: TestWorkspaceDeletionLeak #3520

@ntnn

Describe the bug

The test TestWorkspaceDeletionLeak has been flaking.

I was aware that this could happen when I implemented the test and had planned a workaround, but it requires the goleak maintainers to merge an open PR. Details are here: #3491 (comment)

Instead, the test now uses require.EventuallyWithT (kcptestinghelpers.Eventually would always fail immediately for some reason), but the 30s timeout does not seem to be enough:

 I0807 09:17:51.683088   39314 namespace_controller.go:194] "Namespace has been deleted" component="kcp" postStartHook="kcp-start-controllers" namespace="yef8oaknnwv5ohao|default"
{"level":"warn","ts":"2025-08-07T09:18:05.175445Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/tmp/TestWorkspaceDeletionLeak3304689176/002/artifacts/etcd-server/member/wal/0000000000000000-0000000000000000.wal","error":"fileutil: file already locked"}
    leak_test.go:99: found leaking goroutines: ...
    leak_test.go:99: 
        	Error Trace:	/home/prow/go/src/github.com/kcp-dev/kcp/test/integration/workspace/leak_test.go:99
        	Error:      	Condition never satisfied
        	Test:       	TestWorkspaceDeletionLeak
        	Messages:   	eventually there will be no random goroutines running while checking for leaks
I0807 09:18:20.286940   39314 dynamic_serving_content.go:195] "Failed to remove file watch,

Simply shutting down the kcp server is not an option either, because that could hide potential leaks.

Ignoring every goroutine that has to do with HTTP requests could also hide leaks, e.g. if an HTTP request is sent without a context and runs for a long time.

Steps To Reproduce

  1. Make a PR
  2. Wait for the test to fail randomly
  3. If it doesn't, retrigger until it does: https://prow.kcp.k8c.io/?job=pull-kcp-test-integration

Expected Behaviour

The test should not flake.

Additional Context

No response

Labels

kind/bug: Categorizes issue or PR as related to a bug.
kind/flake: Categorizes issue or PR as related to a flaky test.
