[EAGLE-5341] Load model proto once to stop expecting it from predict request #822
Conversation
Pull Request Overview
This PR optimizes model runner performance by loading the model proto once at initialization instead of expecting it with every predict request from the API. The model proto is now retrieved via a single GetModel call and reused across all subsequent requests.
Key Changes
- Added a `get_model_proto()` method to ModelBuilder to fetch model metadata once via the GetModel API
- Updated ModelRunner to accept and store model_proto during initialization
- Modified the predict, generate, and stream methods to inject the cached model_proto into requests (see the sketches below)
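A minimal sketch of the first two changes, assuming `clarifai_grpc` stubs; the `client`, `STUB`, and `model_id` attribute names here are illustrative assumptions, not the exact implementation:

```python
from clarifai_grpc.grpc.api import service_pb2
from clarifai_grpc.grpc.api.status import status_code_pb2


class ModelBuilder:
    def get_model_proto(self):
        # Single GetModel call at startup; the returned proto is cached
        # and reused for every subsequent predict/generate/stream request.
        response = self.client.STUB.GetModel(
            service_pb2.GetModelRequest(
                user_app_id=self.client.user_app_id,  # illustrative attribute
                model_id=self.model_id,               # illustrative attribute
            )
        )
        if response.status.code != status_code_pb2.SUCCESS:
            raise Exception(f"GetModel failed: {response.status.description}")
        return response.model


# server.py: fetch once at startup, then hand the proto to the runner
# model_proto = builder.get_model_proto()
# runner = ModelRunner(model=model, model_proto=model_proto)
```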
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| clarifai/runners/models/model_builder.py | Adds get_model_proto() method to retrieve and return model proto via GetModel API call |
| clarifai/runners/server.py | Passes the retrieved model_proto to ModelRunner during initialization |
| clarifai/runners/models/model_runner.py | Updates constructor to accept model_proto parameter and injects it into all request types (predict, generate, stream) |
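Judging by the "only set if none" commit in the history below, the injection presumably guards against clobbering a proto that already arrived on the request. A hedged sketch of that pattern — the `_inject_model_proto` helper and the `request.model` field usage are assumptions:

```python
class ModelRunner:
    def __init__(self, model, model_proto=None, **kwargs):
        super().__init__(**kwargs)
        self.model = model
        # Proto fetched once at startup, instead of expecting the API
        # to attach it to every incoming request.
        self.model_proto = model_proto

    def _inject_model_proto(self, request):
        # Only fall back to the cached proto when the request does not
        # already carry one, so an API-supplied proto still wins.
        if self.model_proto is not None and not request.model.id:
            request.model.CopyFrom(self.model_proto)
        return request
```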
The issue with this is that we'll overwrite request-type secrets if we don't also add handling for them to GetModel.
We'll need to ensure that the GetModel endpoint populates the value for request-type secrets.
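To illustrate the concern (the field names here are hypothetical, since the secrets schema isn't shown in this PR): a wholesale `CopyFrom` of the cached proto would drop anything the API attaches per request, so the fix would need a merge that preserves request-scoped fields:

```python
def _inject_model_proto(self, request):
    # Hypothetical merge order: start from the cached GetModel proto,
    # then re-apply whatever arrived on the request (e.g. request-type
    # secrets), so per-request fields override the cached copy.
    if self.model_proto is None:
        return request
    merged = type(self.model_proto)()
    merged.CopyFrom(self.model_proto)  # baseline: proto fetched at startup
    merged.MergeFrom(request.model)    # request-scoped fields win on conflict
    request.model.CopyFrom(merged)
    return request
```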
luv-bansal left a comment:
Looks good
…request (#822)

* load model proto from GetModel request and set it for every predict request
* Update clarifai/runners/models/model_runner.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Revert "Update clarifai/runners/models/model_runner.py"
  This reverts commit f603dff.
* fix suggestion
* only set if none
* suggestion
* Update clarifai/runners/models/model_runner.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix condition
* fix tests
* Update clarifai/runners/models/model_runner.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update clarifai/runners/models/model_builder.py
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix test

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```python
def start_servicer(self, port, pool_size, max_queue_size, max_msg_length, enable_tls):
    # initialize the servicer with the runner so that it gets the predict(), generate(), stream() classes.
    self._servicer = ModelServicer(self._current_model)
```
Isn't this path missing the model proto, since it's only added to the ModelRunner? Why wouldn't we GetModel in the self._current_model so that its proto and model class are always together, regardless of how they are served? @luv-bansal @wemoveon2
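A sketch of what that suggestion might look like: do the GetModel fetch where the model object is built, so the proto and the model class travel together regardless of serving path. The `create_model_instance` call and the `model_proto` attribute on the model are a hypothetical design, not code from this PR:

```python
# server.py, hypothetical restructuring: attach the proto to the model
# object itself so it rides along with the model class.
self._current_model = builder.create_model_instance()        # hypothetical builder call
self._current_model.model_proto = builder.get_model_proto()  # paired at build time

# Both serving paths now share the same model+proto pairing:
self._servicer = ModelServicer(self._current_model)
self._runner = ModelRunner(model=self._current_model)
```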