Refactor EPP to decouple core logic from InferenceModel resource #1001

@zetxqx

Description

As stated in the discussion on pull request #961, the current EPP implementation has a hard dependency on the InferenceModel resource. This coupling was identified during the implementation of the GatewayFollowingEPPRouting conformance test, where the testing EPP could not function without a concrete InferenceModel.

The current EPP logic requires an InferenceModel for two main reasons I spotted:

  • To discover the universe of possible backend pods from the referenced InferencePool (code).
  • To identify the modelName it is serving (code).

Metadata

Labels

triage/accepted: Indicates an issue or PR is ready to be actively worked on.
