
[FEA]: Remove/Deprecate the model_max_batch_size config option #421

Open

Description

Is this a new feature, an improvement, or a change to existing functionality?

Improvement

How would you describe the priority of this feature request?

Medium

Please provide a clear description of the problem this feature solves

As mentioned in issue #420, having two batch-size options, model_max_batch_size and pipeline_max_batch_size, can be confusing to users: it's not clear how they interact or how they impact performance. The model_max_batch_size option is a legacy value from one of the first iterations of Morpheus, where multiple stages needed it to coordinate the size of messages. Since it is now only used by one stage (Inference), it no longer makes sense as a global config option.

Describe your ideal solution

The model_max_batch_size option should be removed/deprecated. Where it is still needed (in the InferenceStage implementations), we can automatically determine the max batch size from either the model or the service. For example, the TritonInferenceStage can determine the model_max_batch_size during its initialization step.

To allow for backward compatibility, we could add a model_max_batch_size property to the InferenceStage itself to override any automatically determined value.
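A minimal sketch of the resolution logic described above. The function name and parameters here are illustrative, not existing Morpheus API; the assumption is that the inference service (e.g. Triton, via its model config) reports a max_batch_size, and that an optional user-supplied override on the stage takes precedence for backward compatibility:

```python
from typing import Optional


def resolve_max_batch_size(model_reported: int,
                           user_override: Optional[int] = None) -> int:
    """Determine the max batch size an InferenceStage should use.

    model_reported: the max_batch_size the inference service reports for the
                    model (Triton, for example, reports 0 when a model does
                    not support batching).
    user_override:  hypothetical backward-compatibility property on the
                    InferenceStage, overriding any auto-determined value.
    """
    if user_override is not None:
        # Explicit stage-level setting wins, preserving old behavior.
        return user_override
    # Treat a reported value of 0 ("no batching") as a batch size of 1.
    return model_reported if model_reported > 0 else 1
```

With this shape, pipelines that never set the override simply inherit the model's own limit, so the global config option becomes unnecessary.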

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request
