@marquiz
Collaborator

@marquiz marquiz commented Jun 10, 2025

Document a preferred setup of the Balloons Policy from the NRI Plugins project.

@marquiz marquiz force-pushed the devel/kubeai-nri branch from 090725a to c3dd144 Compare June 10, 2025 08:55
Collaborator

@eero-t eero-t left a comment

README is getting quite long now, so I wonder would this make sense as a separate document (linked from README), as it's more about configuring the cluster, than configuring KubeAI?

@marquiz marquiz force-pushed the devel/kubeai-nri branch 2 times, most recently from c6972e0 to 54dcb9d Compare June 11, 2025 07:24
@marquiz marquiz marked this pull request as ready for review June 11, 2025 07:29
@marquiz marquiz requested review from mkbhanda and poussa as code owners June 11, 2025 07:29
@marquiz
Collaborator Author

marquiz commented Jun 11, 2025

Updated:

  • marked ready-for-review
  • added support for ollama
  • removed minBalloons and minCPUs settings
  • added (optional) showContainersInNrt plus agent.nodeResourceTopology for debugging/investigation

EDIT: added TOC

@marquiz marquiz changed the title WIP: kubeai: document usage of NRI plugins for performance optimization kubeai: document usage of NRI plugins for performance optimization Jun 11, 2025
@marquiz marquiz force-pushed the devel/kubeai-nri branch from 54dcb9d to 412da85 Compare June 11, 2025 07:35
@poussa
Member

poussa commented Jun 11, 2025

README is getting quite long now, so I wonder would this make sense as a separate document (linked from README), as it's more about configuring the cluster, than configuring KubeAI?

Let's keep it in single file. There is room to trim the README to make it more compact.

@marquiz
Collaborator Author

marquiz commented Jun 11, 2025

README is getting quite long now, so I wonder would this make sense as a separate document (linked from README), as it's more about configuring the cluster, than configuring KubeAI?

Let's keep it in single file. There is room to trim the README to make it more compact.

I went back and forth on this myself. Originally I had a separate document but then decided to put this in the README.

Collaborator

@eero-t eero-t left a comment

Approved, but I still have a few suggestions to slightly improve the text.

@marquiz marquiz force-pushed the devel/kubeai-nri branch from 412da85 to 3d75552 Compare June 11, 2025 11:57
Document a preferred setup of the Balloons Policy from the NRI Plugins
project.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
@marquiz marquiz force-pushed the devel/kubeai-nri branch from 3d75552 to c197914 Compare June 11, 2025 11:58
@eero-t eero-t merged commit 907df7c into opea-project:main Jun 11, 2025
5 checks passed
@marquiz marquiz deleted the devel/kubeai-nri branch June 11, 2025 12:26
@askervin

@marquiz, I needed to change

key: labels/app.kubernetes.io/name

to

key: pod/labels/app.kubernetes.io/name

because otherwise the label match was not working, and the policy resolved the vllm container to be put in the default balloon type. The logs looked like this:

I0611 14:16:53.510691       1 log.go:476] D: [           policy           ] choosing balloon type for container default/docsum-vllm-7b787d6cf7-fqwwz/vllm...
I0611 14:16:53.510722       1 log.go:476] D: [           policy           ] - checking expression <labels/app.kubernetes.io/name In vllm,ollama> of balloon type "kubeai-inference" against container default/docsum-vllm-7b787d6cf7-fqwwz/vllm...
I0611 14:16:53.510742       1 log.go:476] D: [           policy           ] - checking expression <name In server> of balloon type "kubeai-inference" against container default/docsum-vllm-7b787d6cf7-fqwwz/vllm...
I0611 14:16:53.510775       1 log.go:476] D: [           policy           ] - namespace "default" matches namespaces of balloon type "default"

When prefixed with "pod/", the expression resulted in:

checking expression <pod/labels/app.kubernetes.io/name In vllm,ollama> of balloon type "kubeai-inference" against container default/docsum-vllm-7b787d6cf7-4nths/vllm...
=> matches

in the log, and NRT looked as expected:

kubectl get noderesourcetopologies.topology.node.k8s.io -o yaml

...
  - attributes:
    - name: cpuset
      value: 86-93,258-265
    - name: shared cpuset
      value: ""
    - name: excess cpus
      value: 0m
    name: kubeai-inference[0]
    resources:
    - allocatable: "342"
      available: "326"
      capacity: "344"
      name: cpu
    type: balloon
  - attributes:
    - name: cpuset
      value: 86-93
    - name: memory set
      value: ""
    name: default/docsum-vllm-7b787d6cf7-4nths/vllm
    parent: kubeai-inference[0]
    resources:
    - allocatable: "16"
      available: "0"
      capacity: "8"
      name: cpu
    type: allocation for container
...

I think this is a usability issue in our expression evaluation. It would be reasonable to expect "labels/" to automatically match pod labels. What do you think?

@marquiz
Collaborator Author

marquiz commented Jun 11, 2025

because otherwise the label match was not working, and the policy resolved the vllm container to be put in the default balloon

Interestingly, the matchExpression worksforme for the kubeai workload 🤔

I got (in NRT)

- attributes:
  - name: cpuset
    value: "3"
  - name: memory set
    value: ""
  name: kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server
  parent: kubeai-inference[0]

In the logs I see

] allocating resources for container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server (request 1000 mCPU, limit 0 mCPU)...
I0611 15:47:52.906534       1 log.go:476] D: [           policy           ] choosing balloon type for container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server...
I0611 15:47:52.906543       1 log.go:476] D: [           policy           ] - checking expression <labels/app.kubernetes.io/name In vllm,ollama> of balloon type "kubeai-inference" against container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server...
I0611 15:47:52.906551       1 log.go:476] D: [           policy           ] - checking expression <name In server> of balloon type "kubeai-inference" against container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server...
I0611 15:47:52.906553       1 log.go:476] D: [           policy           ]   => matches

So it put the container in the balloon even though not all of the matchExpressions matched. I think that's a bug (or at least an unwanted feature). I'd assume matchExpressions work as in Kubernetes, where all expressions are ANDed. WDYT?
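
To make the difference concrete, here is a minimal Python sketch of the two possible semantics. It is a toy model, not the NRI Plugins implementation: `expr_matches`, `matches_any`, and `matches_all` are hypothetical names, only the `In` operator is modelled, and the container is assumed to expose just its name (`server`) with no value for `labels/app.kubernetes.io/name`, which is what the log above suggests.

```python
from typing import Dict, List, Tuple

# Toy model only; NOT the NRI Plugins expression evaluator.
# An expression is (key, allowed values); only the "In" operator is modelled.
Expression = Tuple[str, List[str]]

# The two expressions of balloon type "kubeai-inference", as seen in the log.
EXPRESSIONS: List[Expression] = [
    ("labels/app.kubernetes.io/name", ["vllm", "ollama"]),
    ("name", ["server"]),
]

# Assumption: the container resolves only its name; the label key is absent.
CONTAINER: Dict[str, str] = {"name": "server"}


def expr_matches(expr: Expression, attrs: Dict[str, str]) -> bool:
    """Return True if the attribute named by the key is one of the values."""
    key, values = expr
    return attrs.get(key) in values


def matches_any(exprs: List[Expression], attrs: Dict[str, str]) -> bool:
    """OR semantics: a single matching expression is enough."""
    return any(expr_matches(e, attrs) for e in exprs)


def matches_all(exprs: List[Expression], attrs: Dict[str, str]) -> bool:
    """AND semantics, as in Kubernetes label selectors."""
    return all(expr_matches(e, attrs) for e in exprs)


if __name__ == "__main__":
    print("any:", matches_any(EXPRESSIONS, CONTAINER))
    print("all:", matches_all(EXPRESSIONS, CONTAINER))
```

Under the OR model the container is accepted, which is consistent with the log above; under the AND model it would be rejected because the first expression does not match.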

@marquiz
Collaborator Author

marquiz commented Jun 11, 2025

It would be reasonable to expect "labels/" to automatically match pod labels. What do you think?

Not so sure about this, as there are also container labels (at the CRI/NRI level). I'd leave that as is and possibly just note the caveat in the documentation.
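
A rough Python sketch of why the prefix matters, under stated assumptions: `resolve_key` is a hypothetical helper, not the actual NRI Plugins resolver, and the label values are assumed for illustration (the pod carries `app.kubernetes.io/name=vllm`, the container has no labels of its own).

```python
from typing import Dict, Optional

# Hypothetical scoped lookup; NOT the actual NRI Plugins key resolver.
POD_LABELS_PREFIX = "pod/labels/"
CONTAINER_LABELS_PREFIX = "labels/"


def resolve_key(
    key: str,
    container_labels: Dict[str, str],
    pod_labels: Dict[str, str],
) -> Optional[str]:
    """Look a key up in pod labels or container labels, depending on prefix."""
    if key.startswith(POD_LABELS_PREFIX):
        return pod_labels.get(key[len(POD_LABELS_PREFIX):])
    if key.startswith(CONTAINER_LABELS_PREFIX):
        return container_labels.get(key[len(CONTAINER_LABELS_PREFIX):])
    return None


# Assumed label sets, for illustration only.
pod_labels = {"app.kubernetes.io/name": "vllm"}
container_labels: Dict[str, str] = {}

if __name__ == "__main__":
    for key in ("labels/app.kubernetes.io/name",
                "pod/labels/app.kubernetes.io/name"):
        print(key, "->", resolve_key(key, container_labels, pod_labels))
```

In this model the unprefixed `labels/` key looks only at container labels and finds nothing, while the `pod/labels/` key finds the pod label, which mirrors the behaviour reported above.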

@marquiz
Collaborator Author

marquiz commented Jun 11, 2025

key: pod/labels/app.kubernetes.io/name

See #1115
