[FFI] Propagate Python errors across FFI boundaries#15596
[FFI] Propagate Python errors across FFI boundaries#15596csullivan merged 20 commits intoapache:mainfrom
Conversation
|
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot |
|
Currently marked as a draft to test for breakages. If possible, it would be good to suppress the |
e389adf to
153d3c6
Compare
|
The implementation is now updated to insert empty dummy frames representing the C++ portions of the stack trace. Variables are only interactive within Python stack frames, but ipdb does provide a snippet of the surrounding C++ source code when navigating up/down through the C++ frame. |
Prior to this commit, the `BacktraceFullCallback` function returned zero for frames that should be excluded from the stack trace, even if they were the `"TVMFuncCall"` that should terminate the stack trace. This commit re-organized `BacktraceFullCallback`, moving the terminating checks to occur prior to the suppression checks, and adding comments to indicate why each suppression is present.
Prior to this commit, if a Python script passes a callback to a C++ function, and that callback raises an exception, the original exception cannot be caught in the outer python script. As a result, interactive debugging, such as done with `pdb` or `ipdb`, could only inspect stack frames outside the first Python to C++ FFI call. This commit updates the FFI API to propagate the Python exception through an FFI boundary. As a result, all Python frames in the stack trace can be inspected.
Previously, Python exceptions were coerced to `tvm.error.TVMError` if no corresponding error type had been registered with `tvm._ffi.register_error`. With the pass-through of Python exceptions, this coercion no longer applies. Unit tests that relied on this coercion needed to be updated as a result.
3a4167f to
04d17fc
Compare
The destructor is executed when an exception propagates across the stack, and avoids losing type information during cleanup.
csullivan
left a comment
There was a problem hiding this comment.
Thank you @Lunderberg ! This greatly improves the debugability of the runtime given significant use of the FFI. Approving with only one doc nitpick.
Co-authored-by: Chris Sullivan <csullivan@octoml.ai>
|
This PR seems to cause an issue here #15880 |
|
NOTE: there has been multiple reports related to TVMGetLastPythonError not found in windows. Looking at this PR, we should have TVM_DLL marked for all the C API functions that get referenced in the ctypes. Atm we only marks |
|
I have a quick hotfix made here, which can be used for validating the |
Propagation of Python exceptions across C++ stack frames was introduced in apache#15596. This commit primarily fixes a typo in the initial implementation, where `Py_IncRef` was used instead of `Py_DecRef`. In addition, this PR resolves errors that were exposed by this typo fix, which caused test failures in `tests/python/unittest/test_crt.py::test_compile_runtime`. These were due to use of the `Py_IncRef` and `Py_DecRef` functions on threads that hadn't acquired the GIL. This usage error has been corrected for both the ctypes and cython FFI handling.
Propagation of Python exceptions across C++ stack frames was introduced in #15596. This commit primarily fixes a typo in the initial implementation, where `Py_IncRef` was used instead of `Py_DecRef`. In addition, this PR resolves errors that were exposed by this typo fix, which caused test failures in `tests/python/unittest/test_crt.py::test_compile_runtime`. These were due to use of the `Py_IncRef` and `Py_DecRef` functions on threads that hadn't acquired the GIL. This usage error has been corrected for both the ctypes and cython FFI handling.
Prior to this commit, if a Python script passes a callback to a C++ function, and that callback raises an exception, the original exception cannot be caught in the outer python script. As a result, interactive debugging, such as done with
pdboripdb, could only inspect stack frames outside the first Python to C++ FFI call.This commit updates the FFI API to propagate the Python exception through an FFI boundary. As a result, all Python frames in the stack trace can be inspected.