-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TaskCancellation #7669
TaskCancellation #7669
Conversation
Can one of the admins verify this patch? |
Test FAILed. |
Test FAILed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test FAILed. |
python/ray/worker.py
Outdated
worker = ray.worker.global_worker | ||
worker.check_connected() | ||
worker.core_worker.kill_actor(actor._ray_actor_id, False) | ||
|
||
if isinstance(id, ray.actor.ActorHandle): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move this to a new PR later
Test FAILed. |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work!
src/ray/core_worker/core_worker.cc
Outdated
auto task_id = object_id.TaskId(); | ||
if (task_manager_->IsTaskPending(task_id)) { | ||
auto task_spec = task_manager_->GetTaskSpec(object_id.TaskId()); | ||
if (!task_spec.IsActorCreationTask()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's do a RAY_CHECK instead of if statement here because this is definitely checked by python code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it actually is.
src/ray/core_worker/core_worker.cc
Outdated
boost::bind(&CoreWorker::TryKillTask, this, task_id, num_tries - 1)); | ||
} | ||
|
||
void CoreWorker::HandleKillTask(const rpc::KillTaskRequest &request, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happen if the task is inside worker's scheduling queue and timeout expired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For direct tasks, the task is never entered into the scheduling queue, it is immediately executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about for actor tasks? Is that not supported at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
Test FAILed. |
Test PASSed. |
Test PASSed. |
Test PASSed. |
Nice catch! I think this is a mistake. Actor handle can only be killed with
ray.kill. Feel free to submit a quick doc fix!
…On Mon, May 25, 2020 at 9:11 PM Mitar ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In python/ray/worker.py
<#7669 (comment)>:
> worker = ray.worker.global_worker
worker.check_connected()
worker.core_worker.kill_actor(actor._ray_actor_id, False)
+def cancel(object_id, force=False):
+ """Kill a task forcefully.
+
+ This will interrupt any running tasks on the actor, causing them to fail
+ immediately. Any atexit handlers installed in the actor will still be run.
+
+ If this actor is reconstructable, it will be attempted to be reconstructed.
+
+ Args:
+ id (ActorHandle or ObjectID): Handle for the actor to kill or ObjectID
+ of the task to kill.
Why it says that ActorHandle is allowed here, but then it raises an
exception if one passes one?
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#7669 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AFBD7A7U2Q6G6IWOFWB3UXDRTM6O3ANCNFSM4LP3W54A>
.
|
Not sure why there are two calls here though. It seems pretty easy to expose to the user only one? API proliferation? |
@mitar We may eventually combine them into one API. The main reason for separating them was that |
But canceling a running task still kills it, no? |
|
Yea, so we could have Anyway, it works for me. I am just surprised with this design decisions. :-) Keep up with good work. :-) Ray is amazing. |
Task Cancellation for locally submitted tasks. Cancellation for remote tasks and the full API will be added in follow-up PRs.
Why are these changes needed?
Related issue number
Closes #854
Checks
scripts/format.sh
to lint the changes in this PR.