Switch control plane gateway from Envoy Gateway to Traefik #89

Merged: negz merged 2 commits into main from specific on May 27, 2026
@negz negz commented May 27, 2026

Collaborator

Fixes #85
Fixes #88

Envoy Gateway doesn't support per-backendRef URLRewrite filters, a Gateway API Extended feature that lets each backend in a weighted traffic split carry its own path rewrite. Modelplane needs this to route across endpoints with different path conventions, for example a self-hosted model serving at /v1/ alongside Groq at /openai/v1/.

The limitation is in Envoy itself: WeightedCluster.ClusterWeight doesn't support prefix_rewrite. See envoyproxy/gateway#7099. Traefik is the only confirmed Gateway API implementation that supports this feature.

This PR replaces Envoy Gateway with Traefik Proxy on the control plane. Workload clusters remain on Envoy Gateway. The HTTPRoute now uses a single rule with per-backendRef URLRewrite filters instead of grouping endpoints by rewritePath into separate rules.
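
The single-rule shape described above can be sketched roughly as follows. This is an illustration, not the code in compose-model-service: the function name `build_http_route` and the input keys `service`, `port`, `weight`, and `rewritePath` are assumptions made for the example, and `parentRefs` and hostnames are left out. Only the Gateway API field names (`backendRefs`, `filters`, `URLRewrite`, `ReplacePrefixMatch`) come from the Gateway API itself.

```python
from typing import Any


def build_http_route(name: str, endpoints: list[dict[str, Any]]) -> dict[str, Any]:
    """Build an HTTPRoute with one rule and a URLRewrite filter per backendRef.

    The endpoint dict keys used here are hypothetical, chosen for this sketch.
    """
    backend_refs = []
    for ep in endpoints:
        ref: dict[str, Any] = {
            "group": "",
            "kind": "Service",
            "name": ep["service"],
            "port": ep["port"],
            "weight": ep.get("weight", 1),
        }
        rewrite = ep.get("rewritePath")
        if rewrite:
            # The filter sits on the backendRef, not on the rule, so each
            # backend in the weighted split gets its own path prefix.
            ref["filters"] = [
                {
                    "type": "URLRewrite",
                    "urlRewrite": {
                        "path": {
                            "type": "ReplacePrefixMatch",
                            "replacePrefixMatch": rewrite,
                        },
                    },
                }
            ]
        backend_refs.append(ref)

    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    # ReplacePrefixMatch pairs with a PathPrefix match.
                    "matches": [{"path": {"type": "PathPrefix", "value": "/"}}],
                    "backendRefs": backend_refs,
                }
            ],
        },
    }
```

Calling it with one endpoint rewriting to /v1/ and another rewriting to /openai/v1/ yields a single rule with two backendRefs, each carrying its own filter.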

The Envoy Gateway Backend CRD is replaced with standard Kubernetes Service plus EndpointSlice resources. Both IP-based endpoints (workload cluster gateway addresses) and FQDN-based endpoints (external SaaS providers) use the same pattern. ExternalName Services aren't an option because Traefik's Gateway API provider explicitly rejects them.
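
As a rough sketch of the Service plus EndpointSlice pattern, the helper below turns an endpoint URL into a selector-less Service and a matching EndpointSlice. It is not taken from compose-model-endpoint: the name `compose_backend`, the port name `http`, and the choice to pick `addressType` from the parsed host (including using `FQDN` for hostnames) are assumptions for illustration, and the PR's actual handling may differ.

```python
import ipaddress
from typing import Any
from urllib.parse import urlsplit


def compose_backend(name: str, url: str) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return a (Service, EndpointSlice) pair for an endpoint URL.

    Hypothetical helper: names and the address-type choice are assumptions.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # IP literals become IPv4/IPv6 slices; anything else is treated as a name.
    try:
        ip = ipaddress.ip_address(host)
        address_type = "IPv4" if ip.version == 4 else "IPv6"
    except ValueError:
        address_type = "FQDN"

    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # No selector: the endpoints come from the EndpointSlice below.
        "spec": {"ports": [{"name": "http", "port": port, "targetPort": port}]},
    }
    endpoint_slice = {
        "apiVersion": "discovery.k8s.io/v1",
        "kind": "EndpointSlice",
        "metadata": {
            "name": name,
            # This label ties the slice to the Service of the same name.
            "labels": {"kubernetes.io/service-name": name},
        },
        "addressType": address_type,
        "ports": [{"name": "http", "port": port}],
        "endpoints": [{"addresses": [host]}],
    }
    return service, endpoint_slice
```

For instance, `compose_backend("workload", "http://10.0.0.5:8080")` produces an IPv4 slice on port 8080, while a URL such as `https://api.example.com/openai/v1` (a placeholder hostname) produces an FQDN slice on port 443.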

Gateway API only allows RequestHeaderModifier and
ResponseHeaderModifier at the backendRef level. Placing URLRewrite
filters on individual backendRefs causes Envoy Gateway to reject the
HTTPRoute with UnsupportedRefValue, resulting in HTTP 500 for all
requests through a ModelService.

This commit groups endpoints by rewritePath and emits one HTTPRoute
rule per unique path. Each rule carries the URLRewrite filter at the
rule level, where Gateway API permits it. Endpoints sharing a
rewritePath land under the same rule and load-balance within it.
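
The grouping this commit message describes might look something like the sketch below. The function name `group_rules` and the endpoint keys `name`, `port`, and `rewritePath` are hypothetical; the sketch only shows the idea of one rule per unique path with the filter at rule level.

```python
from collections import defaultdict
from typing import Any


def group_rules(endpoints: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Emit one HTTPRoute rule per unique rewritePath (illustrative only)."""
    groups: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for ep in endpoints:
        groups[ep.get("rewritePath", "")].append(ep)

    rules = []
    for path, eps in groups.items():
        rule: dict[str, Any] = {
            "matches": [{"path": {"type": "PathPrefix", "value": "/"}}],
            # Endpoints sharing a rewritePath share this rule's backendRefs.
            "backendRefs": [{"name": ep["name"], "port": ep["port"]} for ep in eps],
        }
        if path:
            # The URLRewrite filter lives on the rule, not on any backendRef.
            rule["filters"] = [
                {
                    "type": "URLRewrite",
                    "urlRewrite": {
                        "path": {
                            "type": "ReplacePrefixMatch",
                            "replacePrefixMatch": path,
                        },
                    },
                }
            ]
        rules.append(rule)
    return rules
```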

Fixes #85.

Signed-off-by: Nic Cope <nicc@rk0n.org>
Copilot AI review requested due to automatic review settings May 27, 2026 20:00

Copilot AI left a comment


Pull request overview

This PR switches the control-plane inference gateway from Envoy Gateway to Traefik so Modelplane can use per-backendRef URLRewrite behavior for endpoints with different path conventions.

Changes:

  • Updates InferenceGateway API/schema and generated Python models from Envoy Gateway to Traefik.
  • Replaces Envoy Backend composition with Kubernetes Service + EndpointSlice composition for ModelEndpoint routing.
  • Updates ModelService and InferenceGateway composition/tests for Traefik-backed Gateway API routing.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| apis/inferencegateways/definition.yaml | Updates InferenceGateway schema to Traefik. |
| schemas/python/models/ai/modelplane/inferencegateway/v1alpha1.py | Regenerates Python model fields for Traefik config. |
| schemas/.lock.json | Updates schema lock hash. |
| functions/compose-inference-gateway/function/fn.py | Installs/configures Traefik and composes Traefik Gateway resources. |
| functions/compose-inference-gateway/tests/test_fn.py | Updates expected gateway composition to Traefik. |
| functions/compose-model-endpoint/function/fn.py | Composes Services and EndpointSlices from ModelEndpoint URLs. |
| functions/compose-model-endpoint/tests/test_fn.py | Updates endpoint composition test expectations. |
| functions/compose-model-service/function/fn.py | Builds HTTPRoute backendRefs against Kubernetes Services. |
| functions/compose-model-service/tests/test_fn.py | Updates HTTPRoute tests for Service backendRefs and per-backend rewrites. |
| functions/compose-model-deployment/function/fn.py | Clarifies rewrite path comment for KServe path conventions. |


Comment thread functions/compose-model-endpoint/function/fn.py Outdated
Comment thread functions/compose-model-endpoint/function/fn.py
Comment thread functions/compose-model-endpoint/function/fn.py
Comment thread functions/compose-model-endpoint/function/fn.py Outdated
Comment thread functions/compose-model-endpoint/function/fn.py Outdated
Comment thread functions/compose-inference-gateway/function/fn.py Outdated
Comment thread apis/inferencegateways/definition.yaml

@negz negz left a comment

Collaborator Author


Self-review of LLM-assisted code.

Comment thread apis/inferencegateways/definition.yaml Outdated
Comment thread functions/compose-inference-gateway/function/fn.py Outdated
Comment thread functions/compose-model-endpoint/function/fn.py Outdated
Comment thread functions/compose-model-endpoint/function/fn.py Outdated
Comment thread functions/compose-model-endpoint/function/fn.py Outdated
Envoy Gateway doesn't support per-backendRef URLRewrite filters. This
is a Gateway API Extended feature that allows each backend in a weighted
traffic split to have its own path rewrite. Modelplane needs this to
route across endpoints with different path conventions -- for example a
self-hosted model serving at /v1/ alongside Groq at /openai/v1/.

The limitation is in Envoy itself: WeightedCluster.ClusterWeight doesn't
support prefix_rewrite. See envoyproxy/gateway#7099. Traefik is the only
confirmed Gateway API implementation that supports this feature.

This commit replaces Envoy Gateway with Traefik Proxy on the control
plane. Workload clusters remain on Envoy Gateway. The HTTPRoute now
uses a single rule with per-backendRef URLRewrite filters instead of
grouping endpoints by rewritePath into separate rules.

The Envoy Gateway Backend CRD is replaced with standard Kubernetes
Service plus EndpointSlice resources. Both IP-based endpoints (workload
cluster gateway addresses) and FQDN-based endpoints (external SaaS
providers) use the same pattern. ExternalName Services aren't an
option because Traefik's Gateway API provider explicitly rejects them.

Signed-off-by: Nic Cope <nicc@rk0n.org>

@dennis-upbound dennis-upbound left a comment

Collaborator


lgtm

@negz negz merged commit 46357e4 into main May 27, 2026
2 checks passed
@negz negz deleted the specific branch May 27, 2026 21:57

Development

Successfully merging this pull request may close these issues.

  • Support FQDNs in ModelEndpoint
  • compose-model-service puts URLRewrite filter under backendRef, breaking the data path

3 participants