RossmacD
Contributor

Fall back to finding deployments rather than workers for deletion. Otherwise the pod can have the incorrect name, which leads to workers not scaling down in adaptive mode.

I believe this was part of the issue in #910.
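
For readers skimming the thread, the fallback described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: `find_deletion_target` and its parameters are hypothetical names, and it assumes that a worker name which matches no worker resource is really a pod name of the form `<deployment>-<pod-template-hash>-<suffix>`, which is how Kubernetes names pods created by a Deployment.

```python
from typing import Iterable, Optional


def find_deletion_target(
    worker_name: str,
    worker_resource_names: Iterable[str],
    deployment_names: Iterable[str],
) -> Optional[str]:
    """Return the name of the resource to delete for a retiring worker.

    Hypothetical helper for illustration only; not part of dask-kubernetes.
    """
    # First attempt: the reported name matches a worker resource exactly.
    if worker_name in set(worker_resource_names):
        return worker_name

    # Fallback: treat the name as a pod name and look for the Deployment
    # whose name is a prefix of it.
    candidates = [d for d in deployment_names if worker_name.startswith(d + "-")]

    # Prefer the longest prefix so that "demo" does not shadow "demo-worker-1".
    return max(candidates, key=len, default=None)
```

Without the fallback, a lookup by pod name finds nothing, so the worker is never removed.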

jacobtomlinson merged commit 46350db into dask:main Oct 25, 2024
9 checks passed
@briceruzand

Thanks @RossmacD, this looks like the problem I described in #855 (comment): a 404 on deployments, because pod names were used instead of deployment names.
I will try this fix in the next release.

@briceruzand

I tried the fix. Thanks a lot, my Dask cluster can now scale down. 🎆

But it never scales down to 0. Do you have any idea why?
I also need to make some adjustments to avoid scale up/down flapping ;-)

Do you plan to release a new version with that fix?

@briceruzand

Resolves #855.

@fcourtial
Contributor

fcourtial commented Jan 16, 2025

@jacobtomlinson could we make a release with this fix?

The current version prevents the Dask autoscaler from scaling down. If the cluster creates 100 workers, they are never deleted, and they are restarted because the pods come from a Deployment.

We cannot really use autoscaling because of the cost of the pods that are kept alive.

Best regards.

@jacobtomlinson
Member

Sure @fcourtial I just tagged 2025.1.0.

@fcourtial
Contributor

Thanks @jacobtomlinson !
