kubeai: document usage of NRI plugins for performance optimization #1113
090725a to c3dd144
eero-t
left a comment
The README is getting quite long now, so I wonder whether this would make sense as a separate document (linked from the README), as it's more about configuring the cluster than configuring KubeAI?
c6972e0 to 54dcb9d
Updated:
EDIT: added TOC
54dcb9d to 412da85
Let's keep it in a single file. There is room to trim the README to make it more compact.
I went back and forth on this myself. Originally I had a separate document but then decided to put this in the README.
eero-t
left a comment
Approved, but I still have a few suggestions to slightly improve the text.
412da85 to 3d75552
Document a preferred setup of the Balloons Policy from the NRI Plugins project.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
3d75552 to c197914
@marquiz, I needed to change to when otherwise label match was not working, and the policy resolved vllm container to be put in the default balloon type. Logs looked like this: When prefixed with "pods/", the expression resulted in: in the log, and NRT looked as expected: kubectl get noderesourcetopologies.topology.node.k8s.io -o yaml I think this is a usability issue in our expression evaluation. It would be reasonable to expect "labels/" automatically match pod labels. What do you think?
Interestingly, the matchExpression worksforme for the kubeai workload 🤔 I got (in NRT):

    - attributes:
      - name: cpuset
        value: "3"
      - name: memory set
        value: ""
      name: kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server
      parent: kubeai-inference[0]

In the logs I see:

    ] allocating resources for container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server (request 1000 mCPU, limit 0 mCPU)...
    I0611 15:47:52.906534 1 log.go:476] D: [ policy ] choosing balloon type for container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server...
    I0611 15:47:52.906543 1 log.go:476] D: [ policy ] - checking expression <labels/app.kubernetes.io/name In vllm,ollama> of balloon type "kubeai-inference" against container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server...
    I0611 15:47:52.906551 1 log.go:476] D: [ policy ] - checking expression <name In server> of balloon type "kubeai-inference" against container kubeai/model-qwen2.5-0.5b-cpu-5549fbccc5-cjq2f/server...
    I0611 15:47:52.906553 1 log.go:476] D: [ policy ] => matches

So it put the container in the balloon even though not all matchExpressions matched. I think that's a bug (or an unwanted feature at least). I'd assume the matchExpression works similarly to Kubernetes, where all expressions are ANDed. WDYT?
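To make the two possible semantics concrete, here is a minimal Python sketch, not the NRI plugin's actual code. It evaluates the two expressions from the log above under "any expression matches" (which is what the log output suggests happened) and under "all expressions match" (the Kubernetes-style AND). The `Expression` class, the helper functions, and the assumption that the `labels/app.kubernetes.io/name` key resolves to an empty string at container scope are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical stand-in for one matchExpression using the 'In' operator."""
    key: str
    values: tuple

    def matches(self, attrs: dict) -> bool:
        # A key that cannot be resolved is treated as an empty string (assumption).
        return attrs.get(self.key, "") in self.values


def matches_any(expressions, attrs) -> bool:
    """OR semantics: one matching expression is enough."""
    return any(e.matches(attrs) for e in expressions)


def matches_all(expressions, attrs) -> bool:
    """AND semantics, as with Kubernetes label selectors."""
    return all(e.matches(attrs) for e in expressions)


# The two expressions shown in the log excerpt.
EXPRESSIONS = [
    Expression("labels/app.kubernetes.io/name", ("vllm", "ollama")),
    Expression("name", ("server",)),
]

# Assumed container view: only the container name resolves, the label does not.
CONTAINER = {"name": "server"}

if __name__ == "__main__":
    print("any:", matches_any(EXPRESSIONS, CONTAINER))
    print("all:", matches_all(EXPRESSIONS, CONTAINER))
```

Under these assumptions the OR variant accepts the container on the strength of `name In server` alone, while the AND variant rejects it until the label expression also resolves and matches.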
Not so sure about this, as there are also container labels (at the CRI/NRI level). I'd leave that as is and possibly only note that caveat in the documentation.
See #1115