Changes to multi-model guide in documentation #941
elevran commented on Jun 9, 2025
- clarify the use of BBR upfront (and not only as comment in YAML), and dispatching to different InferencePool/EPP
- fix typo in sample inference requests - both requests were sent to same model
- clarify the use of BBR upfront, and dispatching to different InferencePool/EPP
- fix typo in example inference - both requests were sent to same model
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
✅ Deploy Preview for gateway-api-inference-extension ready!
Hi @elevran. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Once the patch is verified, the new status will be reflected by the ok-to-test label. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
/lgtm
This is indeed much clearer than before. I suggest also explaining (in a follow-up PR) that this example is not a generic way of working with multiple pools, but just the simplest case, where modelName == poolName. We should probably make it clear to newcomers that this is usually not the case, because the LoRA adapter name is used as the modelName, and typically the adapter name is not the pool name (which often equals the base model name).
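To illustrate the distinction between the model name and the pool name, here is a minimal Go sketch, not the project's implementation. It assumes an OpenAI-style JSON request body with a `model` field, and it uses a hypothetical in-memory table (`poolForModel`) to stand in for whatever routing rules select an InferencePool. The header name `X-Gateway-Model-Name` and all model and pool names are assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// modelHeader is an assumed header name, used here only for illustration.
const modelHeader = "X-Gateway-Model-Name"

// poolForModel is a hypothetical routing table from the request's model name
// to the pool that serves it. All names are made up.
var poolForModel = map[string]string{
	"base-model-a": "base-model-a", // simplest case: modelName == poolName
	"adapter-x":    "base-model-a", // LoRA adapter served by its base model's pool
	"base-model-b": "base-model-b",
}

// extractModel reads the "model" field from a JSON request body.
func extractModel(body []byte) (string, error) {
	var req struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	if req.Model == "" {
		return "", errors.New("request body has no model field")
	}
	return req.Model, nil
}

// route returns the header a body-based router might attach and the pool
// selected for the request, according to the hypothetical table above.
func route(body []byte) (map[string]string, string, error) {
	model, err := extractModel(body)
	if err != nil {
		return nil, "", err
	}
	pool, ok := poolForModel[model]
	if !ok {
		return nil, "", fmt.Errorf("no pool configured for model %q", model)
	}
	return map[string]string{modelHeader: model}, pool, nil
}

func main() {
	bodies := []string{
		`{"model":"base-model-a","prompt":"hi"}`,
		`{"model":"adapter-x","prompt":"hi"}`,
	}
	for _, b := range bodies {
		headers, pool, err := route([]byte(b))
		fmt.Println(headers, pool, err)
	}
}
```

In this sketch both requests land in the same pool even though their model names differ, which is the case that a modelName == poolName example does not cover.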
/approve
thanks
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ahg-g, elevran