
elevran
Contributor

@elevran elevran commented Jun 9, 2025

  • clarify the use of BBR upfront (and not only as comment in YAML), and dispatching to different InferencePool/EPP
  • fix typo in sample inference requests - both requests were sent to same model

- clarify the use of BBR upfront, and dispatching to different InferencePool/EPP
- fix typo in example inference - both requests were sent to same model

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
@k8s-ci-robot k8s-ci-robot requested a review from ahg-g June 9, 2025 13:37
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 9, 2025
@k8s-ci-robot k8s-ci-robot requested a review from Jeffwan June 9, 2025 13:37

netlify bot commented Jun 9, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit b5c21a7
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6846e39640f0ae0008401cde
😎 Deploy Preview https://deploy-preview-941--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 9, 2025
@k8s-ci-robot
Contributor

Hi @elevran. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jun 9, 2025
@elevran elevran changed the title from "Changes to multi-model documentation" to "Changes to multi-model guide in documentation" Jun 9, 2025
@nirrozenbaum
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 9, 2025
@nirrozenbaum
Contributor

/lgtm

This is indeed much clearer than before.

I suggest also explaining (in a follow-up PR) that this example is not a generic way of working with multiple pools; it is just the simplest approach, which works when modelName == poolName.

We should probably make it clear to newcomers that this is usually not the case, because the LoRA adapter name is used as the modelName, and the adapter name is typically not the pool name (which often equals the base model name).
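
To illustrate the distinction, here is a minimal sketch of the two lookup strategies. It is not code from the project: the model and pool names, the `MODEL_TO_POOL` table, and both helper functions are hypothetical and exist only to show why routing on modelName == poolName stops working once LoRA adapters are involved.

```python
from typing import Optional

# Hypothetical names, for illustration only. The base model is served by a pool
# with the same name; the LoRA adapter is served by the base model's pool.
MODEL_TO_POOL = {
    "llama-base": "llama-base",        # modelName == poolName
    "food-review-lora": "llama-base",  # adapter name != pool name
}


def resolve_pool_naive(model_name: str) -> str:
    """Assume the pool is named after the model (the simple case in the guide)."""
    return model_name


def resolve_pool(model_name: str) -> Optional[str]:
    """Look up the pool explicitly; return None for unknown model names."""
    return MODEL_TO_POOL.get(model_name)
```

With these assumed names, the naive strategy sends a request for `food-review-lora` to a pool that does not exist, while the explicit lookup sends it to the `llama-base` pool.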

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 9, 2025
@ahg-g
Contributor

ahg-g commented Jun 9, 2025

/approve

thanks

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, elevran

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 9, 2025
@k8s-ci-robot k8s-ci-robot merged commit 33cda4b into kubernetes-sigs:main Jun 9, 2025
7 checks passed
rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 11, 2025
- clarify the use of BBR upfront, and dispatching to different InferencePool/EPP
- fix typo in example inference - both requests were sent to same model

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
@elevran elevran deleted the multi-model-guide branch June 16, 2025 15:11

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

4 participants