@zetxqx zetxqx commented Sep 9, 2025

What type of PR is this?
/kind bug
/kind helm

What this PR does / why we need it:
The current GKE Helm chart does not work correctly with the v1alpha2 API version.

Verified that the fix works with the following steps:

  1. Create a model server (the model simulator is used for easy deployment):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllmsim
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllmsim
  template:
    metadata:
      labels:
        app: vllmsim
    spec:
      containers:
      - name: vllm-sim
        image: ghcr.io/llm-d/llm-d-inference-sim:v0.3.0
        imagePullPolicy: Always
        args:
        - --model
        - meta-llama/Llama-3.1-8B-Instruct
        - --port
        - "8000"
        - --max-loras
        - "2"
        - --lora-modules
        - '{"name": "food-review-1"}'
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        ports:
        - containerPort: 8000
          name: http
          protocol: TCP
        resources:
          requests:
            cpu: 10m
  2. Install the Helm chart:
helm install vllminfpool ./config/charts/inferencepool \
--set inferencePool.modelServers.matchLabels.app=vllmsim \
--set provider.name=gke \
--set inferencePool.apiVersion=inference.networking.x-k8s.io/v1alpha2
  3. Install the Gateway and HTTPRoute:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: inference-gateway
  rules:
  - backendRefs:
    - group: inference.networking.x-k8s.io
      kind: InferencePool
      name: vllminfpool
    matches:
    - path:
        type: PathPrefix
        value: /
EOF
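
One way to confirm that the steps above took effect is to check that the pool and route exist and then send a single request through the gateway. The following is a minimal sketch rather than part of the original verification. It assumes the resources live in the current namespace, that the gateway exposes an HTTP listener on port 80, and that the simulator serves an OpenAI-compatible `/v1/completions` endpoint. The helper names `completion_payload` and `check_inference_route` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the setup; names are illustrative only.

# Build a completion request body for the given model name.
completion_payload() {
  printf '{"model": "%s", "prompt": "hello", "max_tokens": 10}' "$1"
}

# Check that the resources exist, then send one request through the gateway.
# Assumptions: current namespace, port-80 listener, /v1/completions endpoint.
check_inference_route() {
  kubectl get inferencepools.inference.networking.x-k8s.io vllminfpool || return 1
  kubectl get httproute llm-route || return 1
  local ip
  ip=$(kubectl get gateway inference-gateway \
    -o jsonpath='{.status.addresses[0].value}') || return 1
  curl -i "http://${ip}:80/v1/completions" \
    -H 'Content-Type: application/json' \
    -d "$(completion_payload food-review-1)"
}
```

Calling `check_inference_route` after the gateway has been assigned an address should print the HTTP response. If the address is still empty, the gateway is likely not yet programmed and the call is worth retrying later.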

Which issue(s) this PR fixes:

NONE

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 9, 2025
@k8s-ci-robot

@zetxqx: The label(s) kind/helm cannot be applied, because the repository doesn't have them.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 9, 2025

netlify bot commented Sep 9, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 8f91ea4 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68bfc78330c6c10008f4c2c1 |
| 😎 Deploy Preview | https://deploy-preview-1551--gateway-api-inference-extension.netlify.app |

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 9, 2025
@nirrozenbaum

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 9, 2025
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nirrozenbaum, zetxqx

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 9, 2025
@k8s-ci-robot k8s-ci-robot merged commit dda6407 into kubernetes-sigs:main Sep 9, 2025
10 checks passed
@kfswain kfswain mentioned this pull request Sep 19, 2025
@zetxqx zetxqx deleted the helmfix branch September 29, 2025 05:32