Refactor EPP to remove -poolName requirement and support multiple pools per namespace #946

@SinaChavoshi

Summary

The EPP (Endpoint Picker) application currently requires a -poolName command-line argument at startup, which ties each EPP process to a single InferencePool resource. This rules out a simpler architecture in which one shared EPP deployment serves all InferencePools within a given namespace.

Current Behavior
As discovered while debugging the HTTPRouteMultipleRulesDifferentPools conformance test (see #834), the EPP application will not start without the -poolName flag. The pod logs show a fatal error on startup:

{"error": "required \"poolName\" flag not set"}

This requirement is enforced by the validateFlags function in cmd/main.go. Additionally, the EPP requires a health check component that is currently tied one-to-one to an InferencePool.

Impact

This design has two main impacts:

  • It prevents users from deploying a single, shared EPP instance to handle routing for an entire namespace of InferencePools.

  • It complicates the conformance test setup. For example, the HTTPRouteMultipleRulesDifferentPools test requires two separate EPP Deployments and Services to be defined in its manifest because it tests two distinct InferencePools. This adds boilerplate and prevents us from defining a single shared EPP in the base manifests.yaml.

Proposed Solution & Discussion

Should we refactor the EPP to make the -poolName flag optional?

If the -poolName flag were omitted, the EPP could be responsible for discovering and managing all InferencePool resources that exist within the namespace specified by the required -poolNamespace flag.
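A minimal sketch of what an optional -poolName could look like at the flag-validation level. This is not the code in cmd/main.go: the options struct, parseOptions, and describeMode are hypothetical names introduced only for illustration, and the only assumptions carried over from this issue are the two flag names and that -poolNamespace stays required.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// options holds the two flags discussed in this issue. The struct and its
// field names are hypothetical, not taken from cmd/main.go.
type options struct {
	poolName      string
	poolNamespace string
}

// parseOptions registers and parses the flags on a private FlagSet so the
// sketch can be exercised without touching process-wide flag state.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("epp", flag.ContinueOnError)
	fs.StringVar(&o.poolName, "poolName", "", "InferencePool to serve; empty means all pools in the namespace")
	fs.StringVar(&o.poolNamespace, "poolNamespace", "", "namespace to watch (required)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, validateFlags(o)
}

// validateFlags keeps -poolNamespace mandatory but no longer rejects an
// empty -poolName.
func validateFlags(o options) error {
	if o.poolNamespace == "" {
		return errors.New(`required "poolNamespace" flag not set`)
	}
	return nil
}

// describeMode reports which pools the process would be responsible for.
func describeMode(o options) string {
	if o.poolName == "" {
		return fmt.Sprintf("all InferencePools in namespace %q", o.poolNamespace)
	}
	return fmt.Sprintf("InferencePool %s/%s", o.poolNamespace, o.poolName)
}

func main() {
	o, err := parseOptions([]string{"-poolNamespace", "default"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("serving", describeMode(o))
}
```

Passing -poolName explicitly would keep today's single-pool behavior, so existing deployments would not need to change under this sketch.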

This change would:

  • Enable a simpler EPP deployment model.
  • Allow the conformance suite to use a single shared EPP, significantly simplifying test manifests.

Opening this issue to track the discussion on whether this change aligns with the project's goals.

Metadata

Assignees

No one assigned

Labels

lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
