Commit 631998f

fix doc
1 parent 9f133cd commit 631998f

File tree

2 files changed: +10 −7 lines


docs/_sources/py_api/runtime.rst.txt

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ Functions
 
 .. autofunction:: get_whole_cudagraphs_mode
 
-.. autofunction:: set_cudagraphs_modue
+.. autofunction:: set_cudagraphs_mode
 
 .. autofunction:: enable_pre_allocated_outputs
 

docsrc/user_guide/runtime.rst

Lines changed: 9 additions & 6 deletions
@@ -97,15 +97,18 @@ Dynamic Output Allocation Mode
 ------------------------------
 
 Dynamic output allocation is a feature in Torch-TensorRT which allows the output buffer of TensorRT engines to be
-dynamically allocated. This is useful for models with dynamic output shapes, especially ops with data-dependent shapes.
-Without dynamic output allocation, the output buffer is statically allocated and the size is the maximum possible size
-required by the op. This can lead to inefficient memory usage if the actual output size is smaller than the maximum possible size.
+dynamically allocated. This is useful for models with dynamic output shapes, especially ops with data-dependent shapes.
+Dynamic output allocation mode cannot be used in conjunction with the CUDA Graphs or pre-allocated outputs features.
+Without dynamic output allocation, the output buffer is allocated based on the output shape inferred from the input size.
 
 There are two scenarios in which dynamic output allocation is enabled:
 
-1. When the model contains submodules that require a dynamic output allocator at runtime, users don't have to manually enable dynamic output allocation mode.
+1. The model has been identified at compile time to require dynamic output allocation for at least one TensorRT subgraph.
+   These models will engage the runtime mode automatically (with logging) and are incompatible with other runtime modes
+   such as CUDA Graphs.
 
-To specify if a module requires a dynamic output allocator, users can set the ``requires_output_allocator=True`` flag in the ``@dynamo_tensorrt_converter`` decorator of converters. e.g.,
+Converters can declare that the subgraphs they produce will require the output allocator using ``requires_output_allocator=True``,
+thereby forcing any model which utilizes the converter to automatically use the output allocator runtime mode, e.g.,
 
 .. code-block:: python
 
@@ -123,7 +126,7 @@ To specify if a module requires a dynamic output allocator, users can set the ``
     ) -> Union[TRTTensor, Sequence[TRTTensor]]:
         ...
 
-2. When users manually enable dynamic output allocation via the ``torch_tensorrt.runtime.enable_output_allocator`` context manager.
+2. Users may manually enable dynamic output allocation mode via the ``torch_tensorrt.runtime.enable_output_allocator`` context manager.
 
 .. code-block:: python
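
For reference, a converter registration of the kind this passage describes typically looks like the sketch below. This is a hedged reconstruction: only the ``requires_output_allocator=True`` flag, the ``@dynamo_tensorrt_converter`` decorator, and the ``) -> Union[TRTTensor, Sequence[TRTTensor]]:`` fragment appear in this diff; the target op (``torch.ops.aten.nonzero.default``), the function name, and the exact import paths are illustrative assumptions.

.. code-block:: python

    # Sketch of a converter that declares it needs the dynamic output
    # allocator. The op and function name are assumptions for illustration;
    # any model that lowers this op then uses output allocator mode.
    from typing import Dict, Sequence, Tuple, Union

    import torch
    from torch.fx.node import Argument, Target
    from torch_tensorrt.dynamo.conversion import (
        ConversionContext,
        dynamo_tensorrt_converter,
    )
    from torch_tensorrt.dynamo.types import TRTTensor

    @dynamo_tensorrt_converter(
        torch.ops.aten.nonzero.default,
        supports_dynamic_shapes=True,
        requires_output_allocator=True,  # forces output allocator runtime mode
    )
    def aten_ops_nonzero(
        ctx: ConversionContext,
        target: Target,
        args: Tuple[Argument, ...],
        kwargs: Dict[str, Argument],
        name: str,
    ) -> Union[TRTTensor, Sequence[TRTTensor]]:
        ...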
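
The code block for the second scenario is elided from this diff. A minimal usage sketch of the context manager follows; only ``torch_tensorrt.runtime.enable_output_allocator`` is named in the diff, while ``MyModel``, ``inputs``, and the compile settings are placeholders for illustration.

.. code-block:: python

    # Sketch: manually enabling dynamic output allocation mode.
    # The model and inputs below are assumptions for illustration.
    import torch
    import torch_tensorrt

    class MyModel(torch.nn.Module):  # toy model, assumption
        def forward(self, x):
            return torch.relu(x)

    model = MyModel().eval().cuda()
    inputs = [torch.randn((1, 3, 224, 224), device="cuda")]
    trt_mod = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)

    # The mode is enabled on entry and reset when the context exits; it
    # cannot be combined with CUDA Graphs or pre-allocated outputs.
    with torch_tensorrt.runtime.enable_output_allocator(trt_mod):
        out = trt_mod(*inputs)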
