-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] handle unserializable user exception #44878
Conversation
c6b8351
to
18956df
Compare
18956df
to
d06fd57
Compare
3da892f
to
b1185e6
Compare
b1185e6
to
e21ff7d
Compare
c5548c1
to
7ed9740
Compare
d2dcfd3
to
ffdf7ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some tests?
python/ray/exceptions.py
Outdated
try: | ||
pickle.dumps(self.cause) | ||
except Exception as e: | ||
err_msg = f"Unable to serialize exception ({self.cause.__class__}): {e}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original cause of RayTaskError ({self.cause.__class__}) is non-serializable: {e}. Overwriting the cause to be RayError
?
@jjyao Is there any existing tests on testing exceptions? |
ffdf7ee
to
3c4c1b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code lgtm
python/ray/exceptions.py
Outdated
try: | ||
pickle.dumps(self.cause) | ||
except Exception as e: | ||
err_msg = f"The original cause of RayTaskError ({self.cause.__class__}) is not serializable: {e}. Overwriting the cause to RayError." # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this can be split into multiple lines to avoid # noqa
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
dab4719
to
f5f6d2e
Compare
@jjyao Test and doc added. PTAL! |
@@ -27,6 +27,8 @@ field of the ``RayTaskError``. | |||
:start-after: __task_exceptions_begin__ | |||
:end-before: __task_exceptions_end__ | |||
|
|||
If the user's exception cannot be serialized, it would be converted to a RayError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why it ends with colon :
.
and also maybe change to active tense, like:
If Ray cannot serialize the user's exception, it will convert the exception to a ``RayError``
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Missed some sample code before. Fixed
Signed-off-by: hongchaodeng <hongchaodeng1@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made some attempts at some style corrections. Please fix if not correct.
print(type(e.cause)) | ||
# <class 'ray.exceptions.RayError'> | ||
print(e.cause) | ||
# The original cause of RayTaskError (<class '__main__.UnserializableException'>) is not serializable: cannot pickle '_thread.lock' object. Overwriting the cause to RayError. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# The original cause of RayTaskError (<class '__main__.UnserializableException'>) is not serializable: cannot pickle '_thread.lock' object. Overwriting the cause to RayError. | |
# The original cause of RayTaskError (<class '__main__.UnserializableException'>) isn't serializable: can't pickle '_thread.lock' object. Overwriting the cause to a RayError. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is not my writings.
It's output from the program runs.
So let's not change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I just see you suggest changes on the code as well. I will update that and rerun the program to reflect it.
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Hongchao Deng <hongchaodeng1@gmail.com>
Signed-off-by: hongchaodeng <hongchaodeng1@gmail.com>
@jjyao Ready for review. PTAL |
If Ray can't serialize the user's exception, it converts the exception to a ``RayError``. | ||
|
||
.. literalinclude:: ../doc_code/task_exceptions.py | ||
:language: python | ||
:start-after: __unserializable_exceptions_begin__ | ||
:end-before: __unserializable_exceptions_end__ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move it down to after line 49. Currently its in the middle of subclass section.
python/ray/exceptions.py
Outdated
) | ||
logger.warning(err_msg) | ||
self.cause = RayError(err_msg) | ||
self.args = (function_name, traceback_str, self.cause, proctitle, pid, ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can just set self.args in the end of the constructor so we don't need to set it twice. basically move line 122 down to the end.
Signed-off-by: hongchaodeng <hongchaodeng1@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last comment
logger.warning(err_msg) | ||
self.cause = RayError(err_msg) | ||
|
||
self.args = (function_name, traceback_str, self.cause, proctitle, pid, ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move the comments (line 119-121) to here as well since they are comments for self.args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: hongchaodeng <hongchaodeng1@gmail.com>
If I run the following script, core worker would fail to serialize the RayError and crash
(credit to @alexeykudinkin )
This PR will fix this.