Deleting a flow or task run should clear any consumed concurrency limits #5995
Comments
I think we should probably add hooks to deletion of flow and task runs to free concurrency slots.
@madkinsz In addition, what about a CLI command that clears a specific concurrency limit?
Is a fix for this feature going to be prioritized any time soon? We have some users with a lot of interest in this feature but it's essentially unusable right now due to crashed/zombie Tasks filling up slots. This is exacerbated by the fact that we currently can't mass cancel Tasks through the UI
Is there a way to delete these zombie task allotments directly from the DB? My current workaround is to double the configured concurrency in order to allow new flows/tasks to run as expected but that's ... not ideal.
I can also recreate when deleting a flow run but can no longer recreate when a task run is deleted. I'd welcome any feedback or an updated MRE if this is still an issue.
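The deletion hook proposed in the first comment can be sketched with a toy model. This is only an illustration under stated assumptions: `ConcurrencyLimit`, `acquire`, `release`, and `delete_task_run` are hypothetical names invented here, not Prefect's schema or API. The idea is that deleting a run releases any slot it holds on every tag limit.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class ConcurrencyLimit:
    """Toy tag-based limit (hypothetical; not Prefect's actual model)."""

    tag: str
    limit: int
    active_slots: set[UUID] = field(default_factory=set)

    def acquire(self, task_run_id: UUID) -> bool:
        # Refuse the slot when the limit is already saturated.
        if len(self.active_slots) >= self.limit:
            return False
        self.active_slots.add(task_run_id)
        return True

    def release(self, task_run_id: UUID) -> None:
        # discard() is a no-op if the run never held a slot.
        self.active_slots.discard(task_run_id)


def delete_task_run(task_run_id: UUID, limits: list[ConcurrencyLimit]) -> None:
    """Hypothetical deletion hook: free the run's slot on every limit."""
    for limit in limits:
        limit.release(task_run_id)


if __name__ == "__main__":
    db_limit = ConcurrencyLimit(tag="db", limit=1)
    first, second = uuid4(), uuid4()
    db_limit.acquire(first)
    print(db_limit.acquire(second))  # slot is taken, so this is refused
    delete_task_run(first, [db_limit])
    print(db_limit.acquire(second))  # slot was freed by the deletion hook
```

Without the `release` call in the deletion path, the deleted run's ID would stay in `active_slots` and block later runs, which is the behavior this issue reports.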
Opened from the Prefect Public Slack Community
emil.ostergaard: Hello, I have problems with prefect cloud 2.0.
We use kubernetes flow runner, and a dask task runner.
Friday (8/7-2022), I had a flow run which I wanted to abort.
I attempted to use the delete functionality in the UI, thinking it would delete all resources related to the flow_run, including the kubernetes job etc.
It did not remove the kubernetes job, so I removed this manually.
The issue is concurrency-limits: The tasks launched by this flow have a tag with a concurrency limit.
It appears the task data associated with the deleted flow run was not removed from prefect storage.
For instance, if I try:
It shows a bunch of active task ids, even though nothing is running in k8s.
This causes an unfortunate issue where any new flow runs, for this flow, will never start tasks,
because prefect thinks the concurrency-limit is hit, due to these zombie tasks.
However, I can not seem to find a way to manually clean up these task ids, which means this flow is dead.
Any help is appreciated!
anna: Deleting a flow run will delete only the flow run, it will not terminate any external resources
Due to a hybrid model, Prefect doesn't have direct access to your infra, which is why terminating resources this way is difficult
Let me open an issue to investigate the best approach for such zombie tasks
<@ULVA73B9P> open "Investigate the right approach for cleaning up zombie task runs caused by an infrastructure crash to free up concurrency limit slots"
Original thread can be found here.
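The zombie-slot situation described in the thread can be sketched as a reconciliation step: compare the task run IDs still holding slots against the IDs known to be running, and treat the rest as stale. This is a minimal sketch under stated assumptions; `split_zombie_slots` is a hypothetical helper, and how the list of live task runs is obtained (for example, from the orchestration API or from the infrastructure) is left open.

```python
from __future__ import annotations

from typing import Iterable
from uuid import UUID, uuid4


def split_zombie_slots(
    active_slots: Iterable[UUID],
    live_task_run_ids: Iterable[UUID],
) -> tuple[list[UUID], list[UUID]]:
    """Partition slot holders into (still live, zombie).

    Hypothetical helper: a slot is a zombie when its task run ID is
    not among the runs known to be alive.
    """
    live = set(live_task_run_ids)
    kept: list[UUID] = []
    zombies: list[UUID] = []
    for slot in active_slots:
        (kept if slot in live else zombies).append(slot)
    return kept, zombies


if __name__ == "__main__":
    running, deleted = uuid4(), uuid4()
    kept, zombies = split_zombie_slots([running, deleted], [running])
    print(f"kept={len(kept)} zombies={len(zombies)}")
```

The zombie list is what a cleanup command or a manual reset would need to remove from the limit so that new task runs can start.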