Skip to content

chore: Improve error propagation for torch compile #2106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

gs-olive
Copy link
Collaborator

Description

  • Propagate original error through to the user instead of masking with AssertionError
  • Helpful for debugging complex models and converter errors more efficiently

Type of change

  • Logging improvement

Checklist:

  • [ x ] My code follows the style guidelines of this project (You can use the linters)
  • [ x ] I have performed a self-review of my own code
  • [ x ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ x ] I have made corresponding changes to the documentation
  • [ x ] I have added tests to verify my fix or my feature
  • [ x ] New and existing unit tests pass locally with my changes
  • [ x ] I have added the relevant labels to my PR in so that relevant reviewers are notified

- Propagate original error through to the user instead of masking with
`AssertionError`
- Helpful for debugging complex models and converter errors more
efficiently
@gs-olive gs-olive added component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths Story: Dynamo Compile Improvements Issues relating to improvement of the Dynamo compile path labels Jul 12, 2023
@gs-olive gs-olive requested a review from narendasan July 12, 2023 22:35
@gs-olive gs-olive self-assigned this Jul 12, 2023
@github-actions github-actions bot added the component: api [Python] Issues re: Python API label Jul 12, 2023
@gs-olive gs-olive requested a review from peri044 July 13, 2023 21:34
"Halting compilation on build failure since "
+ "pass_through_build_failures was specified as True. "
+ "To return the default Torch implementation and avoid "
+ "halting compilation on engine build failures, "
+ "specify pass_through_build_failures=False."
)
raise
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide an example what the new error message looks like compared to the old?

Copy link
Collaborator Author

@gs-olive gs-olive Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, here is an example:

Prior to this PR (no mention of the assertion-causing error)
Traceback (most recent call last):
...
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/__init__.py", line 112, in compile
    model(*inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 294, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 675, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 282, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 272, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 447, in catch_errors
    return callback(frame, cache_size, hooks, frame_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 531, in _convert_frame
    result = inner_convert(frame, cache_size, hooks, frame_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 127, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 360, in _convert_frame_assert
    return _compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 179, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 430, in _compile
    out_code = transform_code_object(code, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 1000, in transform_code_object
    transformations(instructions, code_options)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 415, in transform
    tracer.run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2029, in run
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 708, in run
    and self.step()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 668, in step
    getattr(self, inst.opname)(inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2117, in RETURN_VALUE
    self.output.compile_subgraph(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 802, in compile_subgraph
    self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 874, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 179, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 930, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 926, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1574, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 33, in torch_tensorrt_backend
    return DEFAULT_BACKEND(gm, sample_inputs, **kwargs)
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 51, in aot_torch_tensorrt_aten_backend
    return aot_module_simplified(
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3715, in aot_module_simplified
    compiled_fn = create_aot_dispatcher_function(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 179, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3254, in create_aot_dispatcher_function
    compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2070, in aot_wrapper_dedupe
    return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2250, in aot_wrapper_synthetic_base
    return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 1524, in aot_dispatch_base
    compiled_fw = compiler(fw_module, flat_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 1442, in f
    out_f = compiler(fx_g, inps)
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 91, in _pretraced_backend
    raise AssertionError(
torch._dynamo.exc.BackendCompilerFailed: backend="functools.partial(<function torch_tensorrt_backend at 0x7ff3bb47c040>, debug=True, precision=<LowerPrecision.FP32: 'fp32'>, workspace_size=0, min_block_size=10, torch_executed_ops=[], pass_through_build_failures=True, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=None, use_experimental_rt=True)" raised:
AssertionError: Halting compilation on build failure since pass_through_build_failures was specified as True. To return the default Torch implementation and avoid halting compilation on engine build failures, specify pass_through_build_failures=False.
With this PR (exact line number and location of error)
CRITICAL:torch_tensorrt.dynamo.backend.backends:Halting compilation on build failure since pass_through_build_failures was specified as True. To return the default Torch implementation and avoid halting compilation on engine build failures, specify pass_through_build_failures=False.
Traceback (most recent call last):
...
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/__init__.py", line 112, in compile
    model(*inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 294, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 675, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 282, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/graph_module.py", line 272, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 447, in catch_errors
    return callback(frame, cache_size, hooks, frame_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 531, in _convert_frame
    result = inner_convert(frame, cache_size, hooks, frame_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 127, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 360, in _convert_frame_assert
    return _compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 179, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 430, in _compile
    out_code = transform_code_object(code, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 1000, in transform_code_object
    transformations(instructions, code_options)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 415, in transform
    tracer.run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2029, in run
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 708, in run
    and self.step()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 668, in step
    getattr(self, inst.opname)(inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2117, in RETURN_VALUE
    self.output.compile_subgraph(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 802, in compile_subgraph
    self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 874, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 179, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 930, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 926, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1574, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 33, in torch_tensorrt_backend
    return DEFAULT_BACKEND(gm, sample_inputs, **kwargs)
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 51, in aot_torch_tensorrt_aten_backend
    return aot_module_simplified(
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3715, in aot_module_simplified
    compiled_fn = create_aot_dispatcher_function(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 179, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3254, in create_aot_dispatcher_function
    compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2070, in aot_wrapper_dedupe
    return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2250, in aot_wrapper_synthetic_base
    return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 1524, in aot_dispatch_base
    compiled_fw = compiler(fw_module, flat_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 1442, in f
    out_f = compiler(fx_g, inps)
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 76, in _pretraced_backend
    trt_compiled = _compile_module(
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py", line 139, in _compile_module
    trt_mod = convert_module(
  File "~/TensorRT/py/torch_tensorrt/dynamo/backend/conversion.py", line 45, in convert_module
    interpreter_result = interpreter.run(
  File "~/TensorRT/py/torch_tensorrt/dynamo/fx_ts_compat/fx2trt.py", line 212, in run
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/interpreter.py", line 138, in run
    self.env[node] = self.run_node(node)
  File "~/TensorRT/py/torch_tensorrt/dynamo/fx_ts_compat/fx2trt.py", line 303, in run_node
    trt_node = super().run_node(n)
  File "/usr/local/lib/python3.10/dist-packages/torch/fx/interpreter.py", line 195, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "~/TensorRT/py/torch_tensorrt/dynamo/fx_ts_compat/fx2trt.py", line 358, in call_function
    return converter(self.network, target, args, kwargs, self._cur_node_name)
  File "~/TensorRT/py/torch_tensorrt/dynamo/converters/aten_ops_converters.py", line 280, in aten_ops_slice
    args[4],
torch._dynamo.exc.BackendCompilerFailed: backend="functools.partial(<function torch_tensorrt_backend at 0x7f30a55a0040>, debug=True, precision=<LowerPrecision.FP32: 'fp32'>, workspace_size=0, min_block_size=10, torch_executed_ops=[], pass_through_build_failures=True, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=None, use_experimental_rt=True)" raised:
IndexError: tuple index out of range

While executing %slice_7 : [num_users=1] = call_function[target=torch.ops.aten.slice.Tensor](args = (%add_171, 0, 0, 9223372036854775807), kwargs = {_itensor_to_tensor_meta: {<tensorrt.tensorrt.ITensor object at 0x7f30a0473270>: ((1, 14, 768),

@gs-olive gs-olive requested a review from narendasan July 14, 2023 00:33
Copy link
Collaborator

@narendasan narendasan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gs-olive gs-olive merged commit 230cf6a into pytorch:main Jul 17, 2023
@gs-olive gs-olive deleted the compile_error_propagation_improvement branch July 17, 2023 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla signed component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths Story: Dynamo Compile Improvements Issues relating to improvement of the Dynamo compile path
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants