
Stop watches on controller stop #2983

Closed
@basti1302


When a controller is started with controller.Start(ctx) and later stopped by cancelling the provided context, watches defined on the controller continue to trigger their event handlers. Since the watches are owned by the controller, I would expect all of them to be terminated once the controller is terminated.

A reproduction repository is here: https://github.com/dash0hq/controller-runtime-reproducer/tree/main

This can be reproduced with the test script contained in the repository, which continuously creates events that trigger the watch (if it is active).

This produces the following output:

2024-10-14T14:48:39Z    INFO    setup    successfully created a new watch
2024-10-14T14:48:39Z    INFO    setup    starting manager
2024-10-14T14:48:39Z    INFO    Starting EventSource    {"controller": "example_controller", "source": "kind source: *v1.Pod"}
2024-10-14T14:48:39Z    INFO    Starting Controller    {"controller": "example_controller"}
2024-10-14T14:48:39Z    INFO    starting server    {"name": "health probe", "addr": ":8081"}
2024-10-14T14:48:39Z    INFO    controller-runtime.metrics    Starting metrics server
2024-10-14T14:48:39Z    INFO    controller-runtime.metrics    Serving metrics server    {"bindAddress": ":8080", "secure": false}
2024-10-14T14:48:39Z    INFO    received create event
...
2024-10-14T14:48:39Z    INFO    received create event
2024-10-14T14:48:39Z    INFO    Starting workers    {"controller": "example_controller", "worker count": 1}
2024-10-14T14:48:39Z    INFO    received update event
2024-10-14T14:48:49Z    INFO    received update event
2024-10-14T14:48:53Z    INFO    received update event
2024-10-14T14:49:09Z    INFO    setup    stopping controller/cancelling controller context
2024-10-14T14:49:09Z    INFO    setup    controller context has been cancelled
2024-10-14T14:49:09Z    INFO    Shutdown signal received, waiting for all workers to finish    {"controller": "example_controller"}
2024-10-14T14:49:09Z    INFO    All workers finished    {"controller": "example_controller"}
2024-10-14T14:49:09Z    INFO    setup    controller has been stopped
2024-10-14T14:49:23Z    INFO    received update event
2024-10-14T14:49:24Z    INFO    received update event
2024-10-14T14:49:24Z    INFO    received update event
2024-10-14T14:49:24Z    INFO    received delete event
2024-10-14T14:49:29Z    INFO    received create event
2024-10-14T14:49:29Z    INFO    received update event
2024-10-14T14:49:29Z    INFO    received update event
...

As you can see, the event handler still receives create, update, and delete events after the controller has been stopped.

In case you are curious about the wider context: my actual use case is stopping or removing a watch dynamically. I want to handle create/update events for third-party resource types (monitoring.coreos.com.PrometheusRule, for example). I do not know in advance whether the third-party CRD is deployed. Thus I have a reconciler watching apiextensionsv1.CustomResourceDefinition with a filter predicate. If the CRD in question is created, I start a new controller/reconciler watching that resource type. If the CRD is deleted later, I would like to stop or remove the watch. Otherwise the following error message is emitted to the logs every couple of seconds: "Unhandled Error" err="pkg/mod/k8s.io/client-go@v0.31.1/tools/cache/reflector.go:243: Failed to watch monitoring.coreos.com/v1, Kind=PrometheusRule: the server could not find the requested resource" logger="UnhandledError". This apparently happens within controller-runtime.
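
To make the lifecycle described above concrete, here is a minimal sketch using only the Go standard library. Everything in it (controllerRegistry, OnCRDCreated, OnCRDDeleted, demoRegistry) is a hypothetical name for illustration, not controller-runtime API; the run function stands in for whatever ends up calling controller.Start(ctx).

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// controllerRegistry keeps one cancel function per dynamically started
// controller, keyed by resource kind. Hypothetical type, not library API.
type controllerRegistry struct {
	mu      sync.Mutex
	cancels map[string]context.CancelFunc
	done    map[string]chan struct{}
}

func newControllerRegistry() *controllerRegistry {
	return &controllerRegistry{
		cancels: map[string]context.CancelFunc{},
		done:    map[string]chan struct{}{},
	}
}

// OnCRDCreated starts run with its own cancellable context, unless a
// controller for this kind is already running. It reports whether it started one.
func (r *controllerRegistry) OnCRDCreated(parent context.Context, kind string, run func(ctx context.Context) error) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, running := r.cancels[kind]; running {
		return false
	}
	ctx, cancel := context.WithCancel(parent)
	done := make(chan struct{})
	r.cancels[kind] = cancel
	r.done[kind] = done
	go func() {
		defer close(done)
		_ = run(ctx) // assumption: in the real setup this would call controller.Start(ctx)
	}()
	return true
}

// OnCRDDeleted cancels the controller's context and waits for run to return.
// It reports whether a controller for this kind was running.
func (r *controllerRegistry) OnCRDDeleted(kind string) bool {
	r.mu.Lock()
	cancel, running := r.cancels[kind]
	done := r.done[kind]
	delete(r.cancels, kind)
	delete(r.done, kind)
	r.mu.Unlock()
	if !running {
		return false
	}
	cancel()
	<-done
	return true
}

// demoRegistry walks through create, duplicate create, delete, duplicate delete.
func demoRegistry() []bool {
	r := newControllerRegistry()
	run := func(ctx context.Context) error { <-ctx.Done(); return nil }
	var results []bool
	results = append(results, r.OnCRDCreated(context.Background(), "PrometheusRule", run))
	results = append(results, r.OnCRDCreated(context.Background(), "PrometheusRule", run))
	results = append(results, r.OnCRDDeleted("PrometheusRule"))
	results = append(results, r.OnCRDDeleted("PrometheusRule"))
	return results
}

func main() {
	fmt.Println(demoRegistry()) // prints [true false true false]
}
```

Cancelling the per-kind context is the part that works today: the controller's workers shut down. The problem reported in this issue is that the watch started on behalf of that controller is not tied to the same cancellation.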

So far I have not found any way to stop or remove a watch.
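
One possible stopgap is to make the event handler itself ignore events once the controller's context has been cancelled. The sketch below models that idea with the standard library only; gate and demoGate are hypothetical names, and the string-based handler is a stand-in for a real event handler, not the controller-runtime handler interface.

```go
package main

import (
	"context"
	"fmt"
)

// gate wraps an event handler so that events arriving after ctx has been
// cancelled are silently dropped. Hypothetical helper for illustration.
func gate(ctx context.Context, handle func(event string)) func(event string) {
	return func(event string) {
		if ctx.Err() != nil {
			return // controller context cancelled: ignore the event
		}
		handle(event)
	}
}

// demoGate delivers two events before cancellation and two after it,
// and returns the events the wrapped handler actually saw.
func demoGate() []string {
	ctx, cancel := context.WithCancel(context.Background())
	var seen []string
	handler := gate(ctx, func(e string) { seen = append(seen, e) })

	handler("create")
	handler("update")
	cancel() // corresponds to stopping the controller
	handler("update")
	handler("delete")
	return seen
}

func main() {
	fmt.Println(demoGate()) // prints [create update]
}
```

This only hides the events from the handler. It does not stop the underlying watch, so the "Failed to watch" log messages quoted above would presumably still appear after the CRD is deleted.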

These previous issues seem to be related, but none of them appears to have resulted in a way to stop watches.
