After the failed cluster is recovered, the residual resources are not deleted #3959
Comments
/cc @chaunceyjiang @lxtywypc @zach593
How about letting the execution-controller watch the cluster object's unready -> ready transition? If we are worried about enqueuing too often, watching only that transition should keep the requeues bounded.
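For illustration, here is a minimal controller-runtime-style sketch of that idea. This is an assumption about how it could be wired up, not Karmada's actual code; it assumes the Cluster API exposes a `Ready` condition via `ClusterConditionReady`:

```go
package execution

import (
	"k8s.io/apimachinery/pkg/api/meta"

	clusterv1alpha1 "github.com/karmada-io/karmada/pkg/apis/cluster/v1alpha1"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// clusterRecoveredPredicate fires only on an unready -> ready transition,
// so the execution-controller re-enqueues affected works once per
// recovery instead of on every cluster status update.
func clusterRecoveredPredicate() predicate.Funcs {
	isReady := func(c *clusterv1alpha1.Cluster) bool {
		return meta.IsStatusConditionTrue(c.Status.Conditions, clusterv1alpha1.ClusterConditionReady)
	}
	return predicate.Funcs{
		UpdateFunc: func(e event.UpdateEvent) bool {
			oldCluster, okOld := e.ObjectOld.(*clusterv1alpha1.Cluster)
			newCluster, okNew := e.ObjectNew.(*clusterv1alpha1.Cluster)
			if !okOld || !okNew {
				return false
			}
			// Enqueue only when the cluster flips from unready to ready.
			return !isReady(oldCluster) && isReady(newCluster)
		},
		// Create/delete/generic events are irrelevant for recovery.
		CreateFunc:  func(event.CreateEvent) bool { return false },
		DeleteFunc:  func(event.DeleteEvent) bool { return false },
		GenericFunc: func(event.GenericEvent) bool { return false },
	}
}
```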
After the cluster is recovered, will all resources on the cluster be resynchronized? If so, will the pressure on the controller suddenly surge? And what does this strategy specifically refer to?
So I've had a thought for a while: make objectWatcher use ...
This makes every affected item rate-limited, and clearing items from the rate limiter may take a long time. One day we might let users control the rate limiter options and the number of async worker retries (or simply switch to controller-runtime); in that case, a recovery mechanism that relies on dynamic parameters is unreliable. On the other hand, there is not much difference between watching cluster objects and the async worker's/controller-runtime's retry mechanism: they both trigger execution-controller reconciliation.
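For context, client-go's per-item exponential backoff behaves roughly as below. This is a self-contained sketch using the real `workqueue` rate limiter; the key string is illustrative:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	// Per-item exponential backoff: 5ms base, capped at 1000s.
	// These are client-go's default controller rate limiter parameters.
	rl := workqueue.NewItemExponentialFailureRateLimiter(5*time.Millisecond, 1000*time.Second)

	// Every retry of the same item doubles its delay, so works that kept
	// failing while a cluster was down accumulate long backoffs.
	for i := 0; i < 5; i++ {
		fmt.Println(rl.When("work/ns/name")) // 5ms, 10ms, 20ms, 40ms, 80ms
	}

	// An item's backoff is only reset by an explicit Forget, which normally
	// happens after a successful sync; a cluster becoming ready again does
	// not clear the accumulated delays by itself.
	rl.Forget("work/ns/name")
	fmt.Println(rl.When("work/ns/name")) // back to 5ms
}
```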
/cc @XiShanYongYe-Chang @lxtywypc @RainbowMango A new version is about to be released. If this issue is not resolved, I think we may need to revert #3808.
@XiShanYongYe-Chang @lxtywypc What's your opinion?
My thoughts may not be correct, but my view is that the scope of this issue can be controlled. Can we consider incorporating it into the next version and addressing the problems mentioned in the current issue there? Please correct me if I'm wrong.
According to the description in #3999, the finalizer of the work was not removed correctly. What I am most concerned about is the resulting inability to remove the cluster.
OK, thanks.
Hi @RainbowMango @chaunceyjiang @XiShanYongYe-Chang, after a discussion within our team, we think the core issue is why we need the max retry at all. As for #3808, I think it's okay to revert it for the version coming soon, but we believe its core idea is still right, and we hope it can be brought back once we solve the max retry problem.
I'll do it as soon as possible.
I think we could discuss this a bit further at this point. Firstly, we wonder why we need the max retry in the async worker.
Hi @lxtywypc, I may not be able to explain the reason why it is there. IMO, maybe we don't need the max retry.
Hi @lxtywypc, I have talked with @RainbowMango, and we agree that we do not need this max retry. Would you like to update it?
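For reference, a hypothetical worker loop showing what dropping the cap means in practice; the `worker` type and `sync` function here are illustrative stand-ins, not Karmada's actual async worker:

```go
package worker

import (
	"k8s.io/client-go/util/workqueue"
)

// worker is a stand-in for an async worker; sync is a hypothetical
// function that applies one work item to a member cluster.
type worker struct {
	sync func(key string) error
}

func (w *worker) processNextItem(queue workqueue.RateLimitingInterface) bool {
	key, shutdown := queue.Get()
	if shutdown {
		return false
	}
	defer queue.Done(key)

	if err := w.sync(key.(string)); err != nil {
		// With a max-retry cap, the item would be dropped here once
		// queue.NumRequeues(key) exceeded the cap, leaving residual
		// resources behind after the cluster recovers. Without the cap,
		// the item keeps retrying with exponential backoff until the
		// sync eventually succeeds.
		queue.AddRateLimited(key)
		return true
	}

	// Success: reset the item's backoff so future failures start small.
	queue.Forget(key)
	return true
}
```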
@XiShanYongYe-Chang @RainbowMango Thanks for your replies. I'd be quite glad to help update it, and I will try to bring #3808 back after that. :)
/assign |
What happened:
As described in comment: #3808 (comment)
What you expected to happen:
After the failed cluster is recovered, these residual resources can be deleted normally.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
Karmada version (kubectl-karmada version or karmadactl version):