An extensible capacity signal for InferenceClusters #70

### What problem are you facing?

An InferenceCluster reports capacity from its configured pool size, not from what can actually run right now. Different GPU schedulers track real availability in their own state: KAI in its queues and resource pools, Kueue in cluster-queue flavor usage, Volcano in its queues. There is no way to feed any of that into Modelplane's capacity view, and no single shape a reader can rely on without knowing which scheduler produced the numbers. An operator running one of these schedulers cannot make reported capacity reflect what their cluster will actually admit, and anyone who wants a richer signal has nowhere to plug one in.

### How could Modelplane help solve your problem?

Modelplane should report capacity in one generic shape. A default source should fill that shape on any cluster from kube-native state (node allocatable minus what pods request), and a scheduler-specific source should be able to refine the signal as an opt-in selected on the InferenceCluster. The default and any scheduler-specific source produce the same shape, so whatever reads capacity does not need to know how it was measured.
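To make the request concrete, here is a minimal sketch of what a generic shape and a kube-native default source could look like. Every name in it (`Capacity`, `CapacitySource`, `KubeNativeSource`, the field names, and the GPU-count-only view) is hypothetical and for illustration only; none of it is existing Modelplane API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Capacity:
    """Hypothetical generic shape: GPU counts only, for illustration."""
    total: int       # GPUs the cluster has in principle
    available: int   # GPUs that could be admitted right now
    source: str      # which capacity source produced the numbers


class CapacitySource(Protocol):
    """Anything that can report capacity in the generic shape."""
    def report(self) -> Capacity: ...


@dataclass
class KubeNativeSource:
    """Assumed default source: node allocatable minus what pods request."""
    node_allocatable: dict[str, int]  # node name -> allocatable GPUs
    pod_requests: dict[str, int]      # node name -> GPUs requested by pods there

    def report(self) -> Capacity:
        total = sum(self.node_allocatable.values())
        # Clamp per node so an over-requested node cannot reduce
        # the availability counted on other nodes.
        available = sum(
            max(0, alloc - self.pod_requests.get(node, 0))
            for node, alloc in self.node_allocatable.items()
        )
        return Capacity(total=total, available=available, source="kube-native")


# Illustrative numbers, not real cluster data.
default = KubeNativeSource(
    node_allocatable={"node-a": 8, "node-b": 8},
    pod_requests={"node-a": 6, "node-b": 8},
)
capacity = default.report()
```

A scheduler-specific source would satisfy the same `report()` contract, so a reader of `capacity` would not need to branch on where the numbers came from.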

Modelplane should not carry hard-coded knowledge of each scheduler. The capacity source is something an operator chooses and the project can add to over time, following the same generic-shape-plus-opt-in-adapter pattern as #68. Any scheduling decision that reads capacity is a consumer of this signal, but the signal stands on its own as a cluster's report of what it can actually run.
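One possible way to keep scheduler knowledge out of the core is a small registry of named sources, with the default used when the InferenceCluster selects nothing. The sketch below assumes a hypothetical selection value (a source name string) and hypothetical `register` and `resolve` helpers. The lambdas return placeholder numbers; a real adapter would read the scheduler's own state.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Capacity:
    """Hypothetical generic shape shared by every source."""
    total: int
    available: int
    source: str


# A capacity source is any zero-argument callable returning a Capacity.
CapacitySource = Callable[[], Capacity]

DEFAULT_SOURCE = "kube-native"  # assumed name for the default source
_registry: dict[str, CapacitySource] = {}


def register(name: str, source: CapacitySource) -> None:
    """Add a source under a name an operator can select."""
    _registry[name] = source


def resolve(selected: Optional[str]) -> CapacitySource:
    """Return the opted-in source, or the default when none is selected."""
    name = selected or DEFAULT_SOURCE
    if name not in _registry:
        # Assumption: an unknown name is an error rather than a silent fallback.
        raise KeyError(f"unknown capacity source: {name}")
    return _registry[name]


# Placeholder sources with made-up numbers.
register("kube-native", lambda: Capacity(16, 2, "kube-native"))
register("kueue", lambda: Capacity(16, 0, "kueue"))

from_default = resolve(None)()     # no opt-in: default source
from_kueue = resolve("kueue")()    # operator opted in to a scheduler source
```

Raising on an unknown name is only one choice; falling back to the default and surfacing a condition on the InferenceCluster would be another.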

