CLI: kill actor via ray kill actors *actor_id* or via dashboard #39240

Open
raycharleston opened this issue Sep 3, 2023 · 2 comments
Labels
core Issues that should be addressed in Ray Core dashboard Issues specific to the Ray Dashboard enhancement Request for new feature and/or capability P2 Important issue, but not time-critical

Comments


raycharleston commented Sep 3, 2023

Description

Allow killing an actor from the CLI and/or dashboard. I know we can use ray.kill(handle) from within a driver script or from within a job submitted to a cluster, but what about killing a misbehaving actor when the job does not include the kill logic?

I previously mentioned this on the message board and was asked to open this issue.

Link to message board post: https://discuss.ray.io/t/how-to-kill-actor-from-cli-or-dashboard/11952

Use case

I'd like to be able to kill an actor from the Dashboard via a Kill button, which would terminate the actor and, if defined, call a function with a conventional name (def __exit__ or def exit) within the Actor.

Following the same pattern, I'd like to be able to terminate an Actor from the CLI. We already have 'ray list actors', so I think it makes sense to also have 'ray kill actors actor_id actor_id actor_id'. If called from the CLI, we would also call the same def __kill__ or def kill function on the Actor.
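To make the proposed command shape concrete, here is a minimal sketch of how the argument parsing could look. This is purely illustrative: 'ray kill actors' is the command proposed in this issue, not an existing Ray command, and `build_parser` and `parse_kill_request` are hypothetical names. It only parses the actor IDs; how the IDs would be resolved and terminated is left open.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the *proposed* 'ray kill actors ACTOR_ID [ACTOR_ID ...]' shape."""
    parser = argparse.ArgumentParser(prog="ray")
    commands = parser.add_subparsers(dest="command", required=True)

    # Hypothetical 'kill' command group, mirroring the existing 'ray list actors'.
    kill = commands.add_parser("kill", help="terminate cluster resources")
    resources = kill.add_subparsers(dest="resource", required=True)

    actors = resources.add_parser("actors", help="terminate actors by ID")
    actors.add_argument("actor_ids", nargs="+", metavar="ACTOR_ID")
    return parser


def parse_kill_request(argv):
    """Return the requested actor IDs, de-duplicated in the order given."""
    args = build_parser().parse_args(argv)
    return list(dict.fromkeys(args.actor_ids))
```

For example, `parse_kill_request(["kill", "actors", "id_a", "id_b"])` yields the two IDs in order; the IDs themselves would come from 'ray list actors'.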

I'm working on an application that will leverage detached named actors as well as unnamed Actors in many ActorPools. While working on some operational documentation for the application, I came across this question and realized there is no way to kill an actor other than from within the job code. Of course, we could always figure out the PID for the actor and kill that, but that solution is not very user-friendly and in certain environments might require a sysadmin.

How does the community handle terminating hung/runaway actors when the job itself isn't smart enough to recognize an actor is hung and perform the cleanup without user interaction? Or when you don't want to kill the entire job?

If we did go down the path of allowing the termination of an actor from the dashboard and/or CLI, we would want to clean up the ActorPool references to that actor so the pool does not keep references to the terminated Actor.
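As a rough illustration of what that pool cleanup could look like from user code today, the sketch below drains the idle actors from a pool and pushes back only the ones that should stay. It relies only on the `pop_idle()` and `push()` methods of `ray.util.ActorPool`; `drop_idle_actors` and the `should_drop` predicate are hypothetical names. It is an assumption-laden workaround rather than a proposed implementation: it only reaches actors that are currently idle, and `pop_idle()` also returns None while the pool has pending submissions.

```python
def drop_idle_actors(pool, should_drop):
    """Remove idle actors for which should_drop(actor) is true.

    `pool` is expected to behave like ray.util.ActorPool: pop_idle() returns
    an idle actor or None, and push(actor) returns an actor to the pool.
    Busy actors are not touched.
    """
    kept, dropped = [], []

    # Drain every actor the pool is currently willing to hand out.
    while (actor := pool.pop_idle()) is not None:
        (dropped if should_drop(actor) else kept).append(actor)

    # Put the survivors back; the dropped ones are no longer referenced by the pool.
    for actor in kept:
        pool.push(actor)
    return dropped
```

The caller decides how to recognize the terminated actor, for example by comparing against a handle it already holds, and can then call ray.kill on whatever is returned if that has not happened yet.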

@raycharleston raycharleston added enhancement Request for new feature and/or capability triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Sep 3, 2023
@cadedaniel cadedaniel added core Issues that should be addressed in Ray Core dashboard Issues specific to the Ray Dashboard labels Sep 25, 2023
@rkooo567 rkooo567 added P1.5 Issues that will be fixed in a couple releases. It will be bumped once all P1s are cleared and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Sep 25, 2023
@OpenCoderX

This would be a great feature. I just found myself in a situation where I launched a detached actor with max_restarts=-1 from a job. It seems like there is no way to kill that actor now, is that correct? Should I just reboot the entire cluster?
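One possible workaround, if the detached actor was created with a name: look it up from any driver connected to the same cluster and kill it there. The sketch below assumes the actor is named and that you know its namespace; it does not help with unnamed actors. `ray.get_actor` and `ray.kill(..., no_restart=True)` are existing Ray calls, while `kill_named_actor` and its `ray_module` parameter are hypothetical (the parameter exists only so the helper can be exercised without a cluster).

```python
def kill_named_actor(name, namespace=None, ray_module=None):
    """Look up a named actor and kill it without letting it restart.

    Assumes the caller is already connected to the cluster, e.g. via
    ray.init(address="auto"). ray.get_actor raises ValueError if no
    actor with that name exists in the namespace.
    """
    if ray_module is None:
        import ray as ray_module  # requires Ray to be installed

    handle = ray_module.get_actor(name, namespace=namespace)
    # no_restart=True prevents a restart even when max_restarts=-1.
    ray_module.kill(handle, no_restart=True)
    return handle


# Usage against a running cluster (names below are placeholders):
#   import ray
#   ray.init(address="auto", namespace="my_namespace")
#   kill_named_actor("my_detached_actor", namespace="my_namespace")
```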

@zhangkuantian

We have started a bunch of unnamed actors through ActorPools. When an actor fails, it keeps retrying, and resources are not being released. However, we currently cannot find a way to kill an actor using its actor ID; the only way to clean up anonymous actors is to restart the cluster, which is too costly. We urgently need a method or command to kill an actor using its actor ID.

@jjyao jjyao added P2 Important issue, but not time-critical and removed P1.5 Issues that will be fixed in a couple releases. It will be bumped once all P1s are cleared labels Nov 12, 2024