Commit 3112224 — "Update open questions" (johnbelamaric, May 9, 2024; parent cee2fa0)

- Some formatting fixes
- Addressed one of the naming concerns

1 changed file: `k8srm-prototype/open-questions.md` (30 additions, 12 deletions)
them:
- https://github.com/kubernetes-sigs/wg-device-management/pull/5/files#r1587245528
- https://github.com/kubernetes-sigs/wg-device-management/pull/5#discussion_r1591620848
- Are most cases going to use some form of `SharedResources`, or do we think
  there are lots of cases where it's just a whole PCIe card?
- Are we trying to come up with a new abstraction for something that is already
  part of the PCIe structure of functions? Can we model based on that instead?
- If we have to do things like "pf-0-vfs" and "memory-slice-0", are we missing
some fundamental part of the model around shared resources - like groups of
shared resources a la Kevin's named model extension (I thought that was too
complex...)?
- Where it gets weird though is when a vendor has a mix of card types to
support.
* Where does one draw the boundaries around each `DevicePool`?
* Do they put all of their simple devices in a single `DevicePool` alongside
all partitionable devices in their own individual `DevicePool`s?
* Why is there a "pool" boundary at all if there is no real meaning tied to
it?
* In the case of simple devices, is it only there so as to avoid having lots
of separate `Device` objects in the API server?
  * If we do want to support both simple and partitionable devices in a single
    `DevicePool`, do we need to add one level of embedding, as @thockin
    proposed? An alternative to this is to go back to my concept of having a
    list of named `SharedResourceGroups` (as opposed to a single
    `SharedResource` section), forcing individual devices to refer back to the
    name of the `SharedResourceGroup` they pull a given shared resource from.
  * What happens as the size of these device pools grows and we hit the limit
    of a single API server object? Where do we draw the boundary then?
- Do we need to deal with cross-pod (and cross-node) "linked" claims ala this
[discussion](https://github.com/kubernetes-sigs/wg-device-management/pull/5#pullrequestreview-2035165945)?
Or can that be handled by a higher-level workload controller that understands
See this
[discussion](https://github.com/kubernetes-sigs/wg-device-management/pull/5#discussion_r1591621730).
- Capacity model naming around shared resource consumption and claim resources
provided is not great.
- ~See
[here](https://github.com/kubernetes-sigs/wg-device-management/pull/5#discussion_r1591623761)~
(addressed)
- See
[here](https://github.com/kubernetes-sigs/wg-device-management/pull/5#discussion_r1591623874)
- Having `DevicePoolName` in the claim status field makes pools effectively
  immutable, which is bad. We need a different solution, maybe device UUIDs?
  See this
  [discussion](https://github.com/kubernetes-sigs/wg-device-management/pull/5#discussion_r1591664614).
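One way to read the "device UUIDs" suggestion: if the claim status records a stable per-device UID rather than the pool's name, the pool object can later be renamed or re-split without invalidating existing allocations. A minimal sketch under that assumption — all type and field names here are invented for illustration, not the prototype's API:

```go
package main

import "fmt"

// Device carries a stable UID that survives pool renames and re-sharding.
type Device struct {
	UID  string
	Name string
}

// DevicePool is just the current grouping; it owns no device identity.
type DevicePool struct {
	Name    string
	Devices []Device
}

// ClaimStatus records the allocated device by UID, not by pool name.
type ClaimStatus struct {
	AllocatedDeviceUID string
}

// resolve finds an allocated device by UID across all pools, regardless
// of which pool currently holds it or what that pool is now named.
func resolve(pools []DevicePool, s ClaimStatus) (string, bool) {
	for _, p := range pools {
		for _, d := range p.Devices {
			if d.UID == s.AllocatedDeviceUID {
				return p.Name + "/" + d.Name, true
			}
		}
	}
	return "", false
}

func main() {
	status := ClaimStatus{AllocatedDeviceUID: "dev-1234"}
	// The device's pool was renamed after allocation; resolution by UID
	// still succeeds because identity lives with the device.
	pools := []DevicePool{{Name: "pool-renamed", Devices: []Device{{UID: "dev-1234", Name: "gpu-0"}}}}
	fmt.Println(resolve(pools, status)) // pool-renamed/gpu-0 true
}
```

The design point is that identity lives with the device rather than the pool, so pool boundaries stay mutable and the immutability problem above goes away.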