Description
As the title says, a misconfiguration in my cluster (some of my CRDs are not applied yet) causes the operator to fail during startup.
The failure surfaces as an exception thrown on the line controllerManager.start(!leaderElectionManager.isLeaderElectionEnabled()); below. The catch block calls stop(), but because started has not yet been set to true (line 195), stop() returns immediately without doing anything.
As a result my JVM hangs: the thread pools of the ExecutorServiceManager have already been started (line 189), and their threads are not daemon threads, so they keep the JVM alive. I think configurationService.getExecutorServiceManager().stop(reconciliationTerminationTimeout) (line 213) should always be called and should be idempotent. That would ensure no dangling threads remain.
Snippet of the code I'm referring to:
java-operator-sdk/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/Operator.java
Lines 187 to 220 in ff3edb7
          // need to create new thread pools if we're restarting because they've been shut down when we
          // previously stopped
          configurationService.getExecutorServiceManager().start(configurationService);

          // first start the controller manager before leader election,
          // the leader election would start subsequently the processor if on
          controllerManager.start(!leaderElectionManager.isLeaderElectionEnabled());
          leaderElectionManager.start();
          started = true;
        } catch (Exception e) {
          stop();
          throw new OperatorException("Error starting operator", e);
        }
      }

      @Override
      public void stop() throws OperatorException {
        Duration reconciliationTerminationTimeout =
            configurationService.reconciliationTerminationTimeout();
        if (!started) {
          return;
        }
        log.info(
            "Operator SDK {} is shutting down...", configurationService.getVersion().getSdkVersion());
        controllerManager.stop();

        configurationService.getExecutorServiceManager().stop(reconciliationTerminationTimeout);
        leaderElectionManager.stop();
        if (configurationService.closeClientOnStop()) {
          getKubernetesClient().close();
        }

        started = false;
      }
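
To illustrate the behaviour I am asking for, here is a minimal, self-contained sketch of a stop() that always releases the thread pool and can safely be called more than once. It is not the actual Operator code: LifecycleSketch, failOnStart and hasLivePool are hypothetical names made up for this example, a plain ExecutorService stands in for the ExecutorServiceManager, and the thrown IllegalStateException only simulates the missing-CRD failure.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for Operator; none of these names are JOSDK API.
class LifecycleSketch {
  private final boolean failOnStart; // simulates the startup failure (e.g. missing CRD)
  private ExecutorService executor;  // stands in for the ExecutorServiceManager pools
  private boolean started;

  LifecycleSketch(boolean failOnStart) {
    this.failOnStart = failOnStart;
  }

  synchronized void start() {
    if (started) {
      return;
    }
    try {
      // Threads from the default factory are non-daemon, so they keep the JVM alive.
      executor = Executors.newFixedThreadPool(2);
      executor.submit(() -> {}); // force a worker thread to be created
      if (failOnStart) {
        throw new IllegalStateException("CRD not found");
      }
      started = true;
    } catch (RuntimeException e) {
      stop(Duration.ofSeconds(1)); // cleans up even though started is still false
      throw new IllegalStateException("Error starting operator", e);
    }
  }

  synchronized void stop(Duration timeout) {
    // No early return on !started: release whatever start() managed to create.
    if (executor != null) {
      executor.shutdown();
      try {
        if (!executor.awaitTermination(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
          executor.shutdownNow();
        }
      } catch (InterruptedException e) {
        executor.shutdownNow();
        Thread.currentThread().interrupt();
      }
      executor = null; // makes repeated calls a no-op
    }
    started = false;
  }

  synchronized boolean hasLivePool() {
    return executor != null;
  }

  public static void main(String[] args) {
    LifecycleSketch op = new LifecycleSketch(true);
    try {
      op.start();
    } catch (IllegalStateException e) {
      System.out.println("start failed: " + e.getCause().getMessage());
    }
    op.stop(Duration.ofSeconds(1)); // second stop is harmless
    System.out.println("pool still alive: " + op.hasLivePool());
  }
}
```

With this shape, the failed start() leaves no worker threads behind, so the JVM can exit normally. In the real Operator the same idea would presumably mean guarding each resource individually rather than guarding the whole stop() method with the started flag.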