Refactor EPP to decouple core logic from `InferenceModel` resource #1001

As stated in the [discussion](https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/961#discussion_r2151072149) on pull request [#961](https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/961), the current EPP implementation has a hard dependency on the InferenceModel resource. This coupling was identified during the implementation of the GatewayFollowingEPPRouting conformance test, where the testing EPP could not function without a concrete InferenceModel.

As far as I can tell, the current EPP logic requires an InferenceModel for two main reasons:
* To discover the universe of possible backend pods from the referenced InferencePool ([code](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/scheduling/scheduler.go#L108)).
* To identify the modelName it is serving ([code](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/requestcontrol/director.go#L94-L97)).
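
To make the two dependencies concrete, here is a rough sketch of one way they could be separated. It is not the EPP's actual code: `Pod`, `PodLister`, `ModelLookup`, `Route`, `staticPool`, and `mapLookup` are all hypothetical names invented for illustration, and real scheduling is reduced to "pick the first pod". The only point is that pod discovery and model-name resolution sit behind separate interfaces, so routing can still work when no InferenceModel exists.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for a backend pod in an InferencePool.
type Pod struct {
	Name    string
	Address string
}

// PodLister is a hypothetical interface: it returns the candidate pods
// for the pool without going through an InferenceModel.
type PodLister interface {
	ListPods() []Pod
}

// ModelLookup is a hypothetical interface: it maps a requested model name
// to a target model name. ok is false when nothing matches the request.
type ModelLookup interface {
	TargetModel(requested string) (target string, ok bool)
}

// staticPool is a toy PodLister backed by a fixed slice.
type staticPool struct{ pods []Pod }

func (p staticPool) ListPods() []Pod { return p.pods }

// mapLookup is a toy ModelLookup backed by a map.
type mapLookup map[string]string

func (m mapLookup) TargetModel(requested string) (string, bool) {
	t, ok := m[requested]
	return t, ok
}

// Route picks a pod and a model name. The model lookup is optional:
// when it is nil or has no match, the requested name is passed through.
// Picking the first pod is a placeholder for real scheduling.
func Route(pods PodLister, models ModelLookup, requested string) (Pod, string, error) {
	candidates := pods.ListPods()
	if len(candidates) == 0 {
		return Pod{}, "", fmt.Errorf("no pods available in pool")
	}
	target := requested
	if models != nil {
		if t, ok := models.TargetModel(requested); ok {
			target = t
		}
	}
	return candidates[0], target, nil
}

func main() {
	pool := staticPool{pods: []Pod{{Name: "pod-a", Address: "10.0.0.1:8000"}}}

	// No model lookup configured: the request still routes.
	pod, model, err := Route(pool, nil, "llama")
	fmt.Println(pod.Name, model, err)
}
```

With this shape, a conformance-test EPP could pass `nil` for the lookup and still route requests to pods in the pool, while a full deployment could plug in a lookup backed by InferenceModel objects.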

