Skip to content

compose-model-service puts URLRewrite filter under backendRef, breaking the data path #85

Description

@negz

What happened?

Every request through a ModelService returns HTTP 500 from the control plane gateway. The model serving pod on the workload cluster is healthy and never receives the request.

compose-model-service builds an HTTPRoute that attaches a URLRewrite filter to each backendRef:

https://github.com/modelplaneai/modelplane/blob/main/functions/compose-model-service/function/fn.py#L181-L192

The Gateway API spec only allows RequestHeaderModifier and ResponseHeaderModifier at the backendRef level. URLRewrite is a rule-level-only filter. Envoy Gateway rejects the route accordingly:

status:
  parents:
  - conditions:
    - type: ResolvedRefs
      status: "False"
      reason: UnsupportedRefValue
      message: Specific filter is not supported within BackendRef, only
        RequestHeaderModifier and ResponseHeaderModifier are supported

With the route unresolved, Envoy emits a direct_response for every matching request:

{
  "method": "POST",
  "x-envoy-origin-path": "/ml-team/qwen/v1/chat/completions",
  "response_code": "500",
  "response_code_details": "direct_response",
  "route_name": "httproute/ml-team/qwen-3e76dee66d25/rule/0/match/0/*"
}

The function comment says per-backend filters are intentional, to support endpoints with different rewritePath values:

Each backendRef carries its own URLRewrite filter so endpoints with different rewritePaths (e.g. composed replicas vs. external SaaS providers) are rewritten correctly per-backend.

But Gateway API doesn't allow this shape.

How can we reproduce it?

Walk through docs/getting-started.md to deploy qwen-demo on GKE. Once ModelDeployment reports READY=True, hit the endpoint:

address=$(kubectl get ms qwen -n ml-team -o jsonpath='{.status.address}')
kubectl run curl-test --image=curlimages/curl --restart=Never \
  --command -- curl -sS -i "$address/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen2.5-0.5B-Instruct","messages":[{"role":"user","content":"hi"}]}'
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/curl-test
kubectl logs curl-test

Response: HTTP/1.1 500 Internal Server Error.

Then inspect the composed HTTPRoute:

kubectl get httproute -n ml-team -o yaml

The ResolvedRefs condition reports UnsupportedRefValue with the message above.

What environment did it happen in?

Modelplane version: v0.1.0-dev.1779486230.g401e962-dirty (built from #83)
Crossplane version: v2.3.1
Inference backend: KServe v0.16.0
Cloud provider: GKE (g2-standard-8, NVIDIA L4)
Kubernetes version: control plane v1.34.0 (kind), workload v1.35.3-gke.1389000
provider-helm: v1.2.4
provider-kubernetes: v1.2.5

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Fields

No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions