
Conversation

laikhtewari (Owner)

No description provided.

@@ -73,20 +73,20 @@ description of the parameters below.

| Name | Description |
| :----------------------: | :-----------------------------: |
| `triton_backend` | The backend to use for the model. Set to `tensorrtllm` to utilize the C++ TRT-LLM backend implementation. Set to `python` to utilize the TRT-LLM Python runtime. |
laikhtewari (Owner, Author) commented:

Why would anyone ever use the Python runtime today? Maybe this is needed for future runtime configuration, but at most this should be optional, with the default set to `tensorrtllm`.

| `triton_max_batch_size` | The maximum batch size that the Triton model instance will run with. Note that for the `tensorrt_llm` model, the actual runtime batch size can be larger than `triton_max_batch_size`. The runtime batch size is determined by the TRT-LLM scheduler based on a number of factors, such as the number of available requests in the queue and the engine build (`trtllm-build`) parameters (such as `max_num_tokens` and `max_batch_size`). |
| `decoupled_mode` | Whether to use decoupled mode. Must be set to `true` for requests setting the `stream` tensor to `true`. |
laikhtewari (Owner, Author) commented:

I think these should all be moved to optional
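
For context: in the tensorrtllm_backend repo these values are normally substituted into the `tensorrt_llm` model's `config.pbtxt` template (typically with a helper such as `tools/fill_template.py`). Below is a minimal sketch of that substitution, assuming the template exposes `${triton_backend}`, `${triton_max_batch_size}`, and `${decoupled_mode}` placeholders that map to `backend`, `max_batch_size`, and `model_transaction_policy.decoupled` respectively.

```python
# Illustrative only: a minimal stand-in for the repo's template-filling helper,
# showing where the three parameters in the table above land in the model config.
# The template below is an assumption about the relevant fields, not the full config.pbtxt.
from string import Template

CONFIG_TEMPLATE = Template(
    'backend: "${triton_backend}"\n'
    "max_batch_size: ${triton_max_batch_size}\n"
    "model_transaction_policy {\n"
    "  decoupled: ${decoupled_mode}\n"
    "}\n"
)

def render_config(triton_backend: str = "tensorrtllm",
                  triton_max_batch_size: int = 64,
                  decoupled_mode: bool = True) -> str:
    # decoupled_mode must be true if clients will set the `stream` tensor to true.
    return CONFIG_TEMPLATE.substitute(
        triton_backend=triton_backend,
        triton_max_batch_size=triton_max_batch_size,
        decoupled_mode=str(decoupled_mode).lower(),  # pbtxt expects lowercase true/false
    )

if __name__ == "__main__":
    print(render_config())
```

With the real helper, the same names are passed as `key:value` pairs on the command line; the point here is just how each table entry maps onto the generated model config.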
