Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] handle unserializable user exception #44878

Merged
merged 11 commits into from
May 2, 2024
26 changes: 26 additions & 0 deletions doc/source/ray-core/doc_code/task_exceptions.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,32 @@ def g(x):
# Exception: the real error

# __task_exceptions_end__
# __unserializable_exceptions_begin__

import threading

class UnserializableException(Exception):
def __init__(self):
self.lock = threading.Lock()

@ray.remote
def raise_unserializable_error():
raise UnserializableException

try:
ray.get(raise_unserializable_error.remote())
except ray.exceptions.RayTaskError as e:
print(e)
# ray::raise_unserializable_error() (pid=222813, ip=172.31.5.154)
# File "/home/ubuntu/ray/tmp~/main.py", line 25, in raise_unserializable_error
# raise UnserializableException
# UnserializableException
print(type(e.cause))
# <class 'ray.exceptions.RayError'>
print(e.cause)
# The original cause of RayTaskError (<class '__main__.UnserializableException'>) is not serializable: cannot pickle '_thread.lock' object. Overwriting the cause to RayError.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The original cause of RayTaskError (<class '__main__.UnserializableException'>) is not serializable: cannot pickle '_thread.lock' object. Overwriting the cause to RayError.
# The original cause of RayTaskError (<class '__main__.UnserializableException'>) isn't serializable: can't pickle '_thread.lock' object. Overwriting the cause to a RayError.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is not my writings.
It's output from the program runs.
So let's not change it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I just see you suggest changes on the code as well. I will update that and rerun the program to reflect it.


# __unserializable_exceptions_end__
# __catch_user_exceptions_begin__

class MyException(Exception):
Expand Down
7 changes: 7 additions & 0 deletions doc/source/ray-core/fault_tolerance/tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,13 @@ field of the ``RayTaskError``.
:start-after: __task_exceptions_begin__
:end-before: __task_exceptions_end__

If the user's exception cannot be serialized, it would be converted to a ``RayError``:
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved

.. literalinclude:: ../doc_code/task_exceptions.py
:language: python
:start-after: __unserializable_exceptions_begin__
:end-before: __unserializable_exceptions_end__

Example code of catching the user exception type when the exception type can be subclassed:

.. literalinclude:: ../doc_code/task_exceptions.py
Expand Down
14 changes: 13 additions & 1 deletion python/ray/exceptions.py
Original file line number Diff line number Diff line change
Expand Up @@ -130,8 +130,20 @@ def __init__(
self.traceback_str = traceback_str
self.actor_repr = actor_repr
self._actor_id = actor_id
# TODO(edoakes): should we handle non-serializable exception objects?
self.cause = cause

hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved
try:
pickle.dumps(cause)
except (pickle.PicklingError, TypeError) as e:
err_msg = (
"The original cause of RayTaskError"
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved
f" ({self.cause.__class__}) is not serializable: {e}."
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved
" Overwriting the cause to RayError."
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved
)
logger.warning(err_msg)
self.cause = RayError(err_msg)
self.args = (function_name, traceback_str, self.cause, proctitle, pid, ip)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can just set self.args in the end of the constructor so we don't need to set it twice. basically move line 122 down to the end.


assert traceback_str is not None

def make_dual_exception_type(self) -> Type:
Expand Down
20 changes: 20 additions & 0 deletions python/ray/tests/test_failure.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import sys
import time
import logging
import threading

import numpy as np
import pytest
Expand Down Expand Up @@ -600,6 +601,25 @@ def check_actor_restart():
ray.get(obj2)


# Previously when there is threading.Lock in the exception, it will cause
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved
# the serialization to fail. This test case is to cover this scenario.
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved
def test_unserializable_exception(ray_start_regular, propagate_logs):
class UnserializableException(Exception):
def __init__(self):
self.lock = threading.Lock()

@ray.remote
def func():
raise UnserializableException

with pytest.raises(ray.exceptions.RayTaskError) as exc_info:
ray.get(func.remote())

assert isinstance(exc_info.value, ray.exceptions.RayTaskError)
assert isinstance(exc_info.value.cause, ray.exceptions.RayError)
assert "is not serializable" in str(exc_info.value.cause)
hongchaodeng marked this conversation as resolved.
Show resolved Hide resolved


def test_final_user_exception(ray_start_regular, propagate_logs, caplog):
class MyFinalException(Exception):
def __init_subclass__(cls, /, *args, **kwargs):
Expand Down
Loading