Ray debugger stepping between tasks #12075
…pdb-task-stepping
Two questions:
- This implementation seems to heavily depend on piggybacking on object metadata. Is this necessary or are there alternative designs that don't require this kind of deep integration?
- If we change the object metadata format, we should do that in another PR first before this one
python/ray/ray_constants.py
Outdated
OBJECT_METADATA_TYPE_ACTOR_HANDLE = b"A"

# A constant indicating the debugging part of the metadata (see above).
OBJECT_METADATA_DEBUG_PREFIX = b"D"
- OBJECT_METADATA_DEBUG_PREFIX = b"D"
+ OBJECT_METADATA_DEBUG_PREFIX = b"DEBUG:"
@ericl: I first tried to do it only by passing things around through the internal_kv store, but that approach is unfortunately not workable, since it would always add a Redis lookup to the critical path of task execution (to check whether we need to drop into the debugger). I will make another PR to refactor the metadata handling first and do it using a comma-separated list!
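To make the comma-separated idea concrete, here is a minimal sketch of how such metadata could be encoded and decoded. It is not the actual Ray implementation: the helper names `encode_metadata` and `decode_metadata`, the field order, and the breakpoint-id payload are assumptions for illustration. Only the `b"A"` type tag and the `b"DEBUG:"` prefix come from the discussion above.

```python
from typing import Optional, Tuple

# Prefix suggested in the review; everything else here is hypothetical.
DEBUG_PREFIX = b"DEBUG:"


def encode_metadata(type_tag: bytes, breakpoint_id: Optional[bytes] = None) -> bytes:
    """Join the type tag and an optional debug field with commas."""
    fields = [type_tag]
    if breakpoint_id is not None:
        fields.append(DEBUG_PREFIX + breakpoint_id)
    return b",".join(fields)


def decode_metadata(metadata: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split the metadata and pull out the debug field, if present."""
    fields = metadata.split(b",")
    type_tag = fields[0]
    breakpoint_id = None
    for field in fields[1:]:
        if field.startswith(DEBUG_PREFIX):
            breakpoint_id = field[len(DEBUG_PREFIX):]
    return type_tag, breakpoint_id


plain = encode_metadata(b"A")
tagged = encode_metadata(b"A", b"abc123")
```

Because the debug marker travels with the object, the reader only inspects bytes it already has, so the common case (no debug field) needs no extra store lookup.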
…pdb-task-stepping
@pcmoritz I'm trying this out with
And it seems like after I call
Thanks Eric, I fixed this now and added it as a test case!
Why are these changes needed?
This PR adds the capability to step between Ray tasks in the debugger. It adds two new commands to the debugger. The first, "remote", fast-forwards to the next .remote call, steps into it, and attaches the debugger there so users can inspect the status, stack, etc. The second, "get", finishes the current task and enters the debugger when ray.get is called on the result of the task.

I will add an example and more documentation in a separate PR.
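As a rough illustration of the behavior of the two commands, here is a toy, single-process model. It does not use Ray and is not how the PR implements the feature: `ToyRuntime`, `ToyObject`, and the flag names are hypothetical, and the "debugger attach" is only recorded as an event.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class ToyObject:
    """Stand-in for a task result; break_on_get models a marker carried with it."""
    value: Any
    break_on_get: bool = False


class ToyRuntime:
    """Toy model only; none of these names are Ray APIs."""

    def __init__(self) -> None:
        self.step_into_next_remote = False  # set by the "remote" command
        self.break_on_result = False        # set by the "get" command
        self.events: List[Tuple[str, str]] = []  # where a debugger would attach

    def remote(self, func: Callable[..., Any], *args: Any) -> ToyObject:
        # The marker travels with the task submission.
        attach = self.step_into_next_remote
        self.step_into_next_remote = False  # only the next call is affected
        if attach:
            self.events.append(("attach_in_task", func.__name__))
        value = func(*args)
        # The "get" marker travels with the returned object instead.
        obj = ToyObject(value, break_on_get=self.break_on_result)
        self.break_on_result = False
        return obj

    def get(self, obj: ToyObject) -> Any:
        if obj.break_on_get:
            self.events.append(("attach_at_get", repr(obj.value)))
        return obj.value


def add_one(x: int) -> int:
    return x + 1


rt = ToyRuntime()
rt.step_into_next_remote = True  # user typed "remote" at the prompt
first = rt.remote(add_one, 1)    # debugger would attach inside add_one
rt.break_on_result = True        # user typed "get" (simplified: set before the task runs)
second = rt.remote(add_one, 2)   # task runs to completion
rt.get(first)                    # no marker on this object
rt.get(second)                   # debugger would attach here
print(rt.events)
```

When neither flag is set, both `remote` and `get` follow the ordinary path and record nothing, which is the property the PR aims for.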
This functionality is implemented by tracing through the .remote codepath and the object return codepath. This allows us to provide the functionality without an extra flag and without additional cost during normal execution. The metadata for objects had to be slightly altered to make it possible to store the additional information required to implement this.

Related issue number
Checks
scripts/format.sh to lint the changes in this PR.