Skip to content

Conversation

sam-maloney
Copy link
Contributor

Problem: the number of slots decided on by the scheduler is not currently recorded in its allocated R

Add an (optional) nslots key to the execution dict of the Rv1 spec which SHOULD be included in a scheduler allocation

This update documents the design change proposed in flux-framework/flux-core#6632 (specifically this comment)if this idea seems like it would also be feasible to implement in fluxion

@grondo
Copy link
Contributor

grondo commented Aug 19, 2025

This LGTM! Thanks!

We do need to get an OK from @trws and/or @milroy at least to ensure this will work and is reasonable for Fluxion.

@trws
Copy link
Member

trws commented Aug 19, 2025

I do want to have the number of slots in there, but this actually exposes some things we probably need to fix. We need the number of slots, the shape of the slots and the location of them in the end. Looking at the branch using this it looks like slots per node is being calculated assuming the number on each node is constant for example, which is not necessarily true. It also leaves the shell to recalculate that shape. If this is useful in the interim while we figure out how to do that, I'm good with it and fluxion could pretty easily set it, but I'm not sure how much it buys us.

@sam-maloney
Copy link
Contributor Author

I still think this would be useful in the interim, as the restrictions in Jobspec v1 mean that everything else is essentially trivially determined once the number of slots is included, so I believe that e.g. the shell or job-list services would be able to fully determine their necessary information without having to access the jobspec resources array for allocated jobs. Thus, even such a "simple" addition to Rv1 makes several other things much easier, and philosophically helps separate which tasks should be looking for information about the request in the jobspec (for pending jobs) vs tasks that should be using the concrete information in R once the job has been allocated (and not having to parse slot counts from the jobspec in all cases).

In the longer term I completely agree that much more information will need to be included in the resource set in the future. My two cents would be to design a specification for a "canonical" R which could fully represent everything in a concretized canonical jobspec. And if it's still necessary to have a more limited Rv2 as a stepping stone, it would then be defined as a restricted subset of the canonical form, in the same way jobspec v1 is a restricted subset of canonical jobspec.

My initial instinct would be to represent resources in R using exactly the same structure as the canonical jobspec resources array, only with concrete information now specifcied. i.e a node type vertex would now have a nodelist entry, a core or gpu vertex would now have an idset entry (or some universal identifier scheme for all resources). If a simple count is useful, then the count key could still be included only with ranges or idsets also concretized to integers. This way all information decided on by the scheduler should be saved in the allocation, including what resources belong to which slots, and it makes it easy to compare between the initial jobspec request and the final R allocation by having the same structure.

@trws
Copy link
Member

trws commented Aug 26, 2025

That's a fair point. As I said I don't see any problem with it, it's easy enough to supply. @milroy do you have any thoughts here?

@milroy
Copy link
Member

milroy commented Aug 27, 2025

These changes look good to me with respect to the interim need, and I'm fine with this PR as is. I agree with @trws that fluxion can be adapted to support nslots without too much difficulty.

The flexible match policy to be added by this PR: flux-framework/flux-sched#1296 can violate some of the assumptions in flux-framework/flux-core#6632 and will require more information about each slot to enable correct calculation. However, some of that difficulty can be alleviated by regularizing the slot shape by returning a single "best" shape per match. @zekemorton is working on that approach now.

@sam-maloney
Copy link
Contributor Author

One thought I had while updating flux-framework/flux-core#6632 was that it would probably be better to ultimately have this as a mandatory field for a scheduler allocation, i.e. "A conforming scheduler SHALL include this in its allocations" (instead of SHOULD).

It would still be an optional field in general, since not all resource sets are scheduler allocations, but this would mean that once it is implemented in fluxion the flux-core shell init could be further simplified to not have to worry about a future scheduler not setting nslots.

As @garlick said in #402 (comment) "Move RFCs forward, let code catch up" 😁

Problem: the number of slots decided on by the scheduler is not
currently recorded in its allocated R

Add an (optional) nslots key to the execution dict of the Rv1 spec
which SHALL be included in a scheduler allocation
Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looks good and lets get this merged!

I agree about making the nslots required so we can remove the conditional logic eventually in the job shell.

@mergify mergify bot merged commit 190cc35 into flux-framework:master Aug 28, 2025
7 checks passed
@sam-maloney sam-maloney deleted the rfc20-nslots-in-Rv1 branch August 28, 2025 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants