Description
Is this a new feature, an improvement, or a change to existing functionality?
Improvement
How would you describe the priority of this feature request?
Medium
Please provide a clear description of the problem this feature solves
As mentioned in issue #420, the two batch-size options, model_max_batch_size and pipeline_max_batch_size, can be confusing to users, and it's not clear how they interact or how they impact performance. The model_max_batch_size config option is a legacy value from one of the first iterations of Morpheus, where multiple stages needed it to coordinate the size of messages. Since it's now only used by one stage (InferenceStage), it no longer makes sense as a global config option.
Describe your ideal solution
The model_max_batch_size option should be removed/deprecated. Where it is still needed (in the InferenceStage implementations), we can automatically determine the max batch size from either the model or the service. For example, the TritonInferenceStage can determine the model_max_batch_size during its initialization step.
To allow for backward compatibility, we could add a model_max_batch_size property to the InferenceStage itself to override any automatically determined value.
Describe any alternatives you have considered
No response
Additional context
No response
Code of Conduct
- I agree to follow this project's Code of Conduct
- I have searched the open feature requests and have found no duplicates for this feature request