Description
What would you like to be added?
In a composable system, we need to consider the best design for the composable DRA driver, which manages fabric devices, and the vendor DRA driver, which manages node-local devices.
According to KEP-5007 (kubernetes/enhancements#5012), especially (kubernetes/enhancements#5012 (comment)), there are three ideas:
(If I've missed something, please let me know.)
1. Move the device and update the ResourceSlice
- Basic concept:
- Advantages: Minimizes the features that must be added to the vendor DRA driver
- Problems: Conflicts caused by device moves and ResourceSlice updates need to be considered
- New implementation required in vendor DRA driver: Periodic rescan only
2. Implement a device autoscaler in ClusterAutoscaler
- Basic concept: https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/5007-device-attach-before-pod-scheduled#alternative-approach
- Advantages: The scheduler does not need to reschedule pods.
- Problems: Requires a major addition of functionality to ClusterAutoscaler (CA). May not fit the CA concept (CA scales nodes horizontally)
- New implementation required in vendor DRA driver: Periodic rescan only
3. Make vendor DRA driver aware of fabric devices
- Basic concept: KEP-5007: DRA Device Binding Conditions enhancements#5012 (comment)
- Advantages: The scheduler does not need to reschedule pods. The happy path can be used (binding can proceed directly after device attachment). Moving devices between ResourceSlices can be avoided.
- Problems: A large feature needs to be added to the vendor DRA driver.
- New implementations required for the vendor DRA driver:
- Fabric device recognition
- Composable-related interfaces such as attach
- Updating ResourceSlices after attach
- Updating BindingConditions in ResourceClaim
- Synchronization with vendor DRA drivers on other nodes, etc.
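To make the "periodic rescan only" requirement in ideas 1 and 2 more concrete, here is a minimal Go sketch of what such a rescan could look like. It is only an illustration under assumptions: `diffDevices`, `rescanOnce`, and the `scan` callback are hypothetical names, devices are reduced to plain name strings, and no real DRA or ResourceSlice API is called. The idea is that the vendor DRA driver rescans node-local devices, compares the result with what it last published, and republishes only when something changed.

```go
package main

import (
	"fmt"
	"sort"
)

// diffDevices compares the device names that were last published with the
// names found by the latest scan. It returns the names that appeared and the
// names that disappeared, each sorted. (Hypothetical helper, not a DRA API.)
func diffDevices(published, scanned []string) (added, removed []string) {
	prev := make(map[string]bool, len(published))
	for _, n := range published {
		prev[n] = true
	}
	cur := make(map[string]bool, len(scanned))
	for _, n := range scanned {
		if cur[n] {
			continue // ignore duplicate names within one scan
		}
		cur[n] = true
		if !prev[n] {
			added = append(added, n)
		}
	}
	for _, n := range published {
		if !cur[n] {
			removed = append(removed, n)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

// rescanOnce runs one scan and reports the device list that should be
// published next and whether it differs from the previous one. A real driver
// would update its ResourceSlice at the point where changed is true; that
// step is intentionally left out of this sketch.
func rescanOnce(published []string, scan func() []string) (next []string, changed bool) {
	scanned := scan()
	added, removed := diffDevices(published, scanned)
	if len(added) == 0 && len(removed) == 0 {
		return published, false
	}
	return scanned, true
}

func main() {
	published := []string{"gpu-0"}
	// Hypothetical scan result: a fabric GPU was attached to the node.
	scan := func() []string { return []string{"gpu-0", "gpu-1"} }

	next, changed := rescanOnce(published, scan)
	fmt.Println(next, changed)
}
```

In a real driver, `rescanOnce` would presumably be called from a timer loop, and the conflict handling mentioned under idea 1 would have to happen around the publish step.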
I think idea 1 is good, but I would like to hear from DRA experts on which of these ideas is better, or if there are any better ideas.
/cc @pohly
/cc @klueska
/cc @johnbelamaric
/cc @KobayashiD27
/sig node
Why is this needed?
Composable disaggregated infrastructure needs it for GPUs that are connected to a node on demand:
https://kccnceu2024.sched.com/event/1ZPDw/iown-bof-challenges-of-kubernetes-for-composable-disaggregated-computing-naoki-oguchi-fujitsu-hidetsugu-sugiyama-red-hat-clara-li-intel-ryosuke-kurebayashi-ntt