Summary
The EPP (End Point Picker) application currently requires a -poolName command-line argument at startup, which ties a single EPP process to a single InferencePool resource. This prevents a simpler architecture where one shared EPP deployment could serve all InferencePools within a given namespace.
Current Behavior
As discovered while debugging the HTTPRouteMultipleRulesDifferentPools conformance test (see #834), the EPP application will not start without the -poolName flag. The pod logs show a fatal error on startup:
{"error": "required \"poolName\" flag not set"}
This requirement is enforced by the validateFlags function in cmd/main.go. Additionally, the EPP requires a health check component that is currently tied one-to-one to an InferencePool.
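For readers unfamiliar with the code, the check described above can be approximated with a minimal sketch. This is not the actual validateFlags implementation from cmd/main.go: the parsePoolName helper, the flag set name, and the flag description are hypothetical, and only the error text is taken from the pod logs quoted above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parsePoolName is a hypothetical stand-in for the flag handling described
// in this issue; it is NOT the real validateFlags function from cmd/main.go.
func parsePoolName(args []string) (string, error) {
	fs := flag.NewFlagSet("epp", flag.ContinueOnError)
	poolName := fs.String("poolName", "", "name of the InferencePool to serve")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	// An empty value is rejected, so one process is bound to exactly one pool.
	if *poolName == "" {
		return "", errors.New(`required "poolName" flag not set`)
	}
	return *poolName, nil
}

func main() {
	// Without the flag, startup fails with the error seen in the pod logs.
	if _, err := parsePoolName(nil); err != nil {
		fmt.Println("startup would fail:", err)
	}
	// With the flag, the process is bound to the named pool.
	if name, err := parsePoolName([]string{"-poolName", "my-pool"}); err == nil {
		fmt.Println("serving pool:", name)
	}
}
```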
Impact
This design has two main impacts:
- It prevents users from deploying a single, shared EPP instance to handle routing for an entire namespace of InferencePools.
- It complicates the conformance test setup. For example, the HTTPRouteMultipleRulesDifferentPools test requires two separate EPP Deployments and Services to be defined in its manifest because it tests two distinct InferencePools. This adds boilerplate and prevents us from defining a single shared EPP in the base manifests.yaml.
Proposed Solution & Discussion
Should we refactor the EPP to make the -poolName flag optional?
If the -poolName flag were omitted, the EPP could be responsible for discovering and managing all InferencePool resources that exist within the namespace specified by the required -poolNamespace flag.
This change would:
- Enable a simpler EPP deployment model.
- Allow the conformance suite to use a single shared EPP, significantly simplifying test manifests.
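To make the proposal concrete for discussion, here is a rough sketch of how an optional -poolName could be interpreted. It assumes that -poolNamespace stays required and that an empty pool name means "all InferencePools in that namespace". The poolScope type, the resolveScope and matches helpers, and the -poolNamespace error text are all hypothetical and do not exist in the EPP today.

```go
package main

import (
	"errors"
	"fmt"
)

// poolScope is a hypothetical description of which InferencePools an EPP
// process is responsible for.
type poolScope struct {
	Namespace string
	PoolName  string // empty means every InferencePool in Namespace
}

// resolveScope sketches the proposed validation: -poolNamespace is still
// required, while -poolName becomes optional. The error text is illustrative.
func resolveScope(poolNamespace, poolName string) (poolScope, error) {
	if poolNamespace == "" {
		return poolScope{}, errors.New(`required "poolNamespace" flag not set`)
	}
	return poolScope{Namespace: poolNamespace, PoolName: poolName}, nil
}

// matches reports whether an InferencePool with the given namespace and name
// falls inside the scope.
func (s poolScope) matches(namespace, name string) bool {
	if namespace != s.Namespace {
		return false
	}
	return s.PoolName == "" || s.PoolName == name
}

func main() {
	// No pool name: the scope covers the whole namespace.
	shared, _ := resolveScope("default", "")
	fmt.Println(shared.matches("default", "pool-a"), shared.matches("default", "pool-b"))

	// Explicit pool name: the scope matches today's single-pool behavior.
	single, _ := resolveScope("default", "pool-a")
	fmt.Println(single.matches("default", "pool-a"), single.matches("default", "pool-b"))
}
```

A real implementation would also need to decide how the per-pool health check mentioned above behaves once one process serves several pools; this sketch deliberately leaves that open.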
Opening this issue to track the discussion on whether this change aligns with the project's goals.