-
Notifications
You must be signed in to change notification settings - Fork 179
Add a set of configuration defaults #1223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
…add one Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Hi @shmuelk. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
ok-to-test |
The MaxScorePicker should behave as the RandomPicker when there are no scores, I recommend making it that way if that is not the case now; with this, we would always default to MaxScorePicker and remove the RandomPicker. |
This PR is working with the pickers as they are. I agree with your idea, I just think it needs to be done in a separate PR, which is a breaking API/UX PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
||
thePlugins := []SchedulingPlugin{} | ||
for _, pluginConfig := range cfg.Plugins { | ||
thePlugins = append(thePlugins, SchedulingPlugin{PluginRef: pluginConfig.Name}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we only add the plugins that are applicable allowed in a scheduling profile? scorer, filter, picker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point in time in the code, one doesn't know what the plugins are as they have not yet been instantiated.
If you want, I can move this code to the LoadSchedulingConfig function, with appropriate changes to the configuration validation code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend consolidating validation in this pkg, this is the practice, validation is invoked after defaulting automatically. We can do that in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the code in ScheculingProfile.AddPlugins, ignores any plugins that are of types other than filter, scorer, picker, or postCycle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, but when we default and log the config, they will show up, silently ignoring them is not ideal.
if scorer, ok := referencedPlugin.(framework.Scorer); ok { | ||
if plugin.Weight == nil { | ||
return nil, fmt.Errorf("scorer '%s' is missing a weight", plugin.PluginRef) | ||
// Set default weight to 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly recommend consolidating all defaulting in the defaults.go for two reasons: 1) it is harder to reason about the default behavior and debug it when it is scattered 2) logging the actual configuration to be applied become easier and the log at https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/cmd/epp/runner/runner.go#L407 would show all the defaults we apply
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the number one to a constant. At the point in time in the code where there is logging, the addition of the profile handler and picker are not yet done, In addition the setting of the default weight for scorers also happens latter on.
I can change the code to modify the structs of the read in configuration and move the logging a bit, if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not about the weight only, I am referring to the whole logic we added here, including adding a default picker and default handler. I think all that should be done in the defaulting file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As there are two defaults.go files in the repo, I want to make sure we are talking about the same file. I assume you are talking about defaults.go in the package apix/config/v1alpha1.
If I am correct, then:
- This code is automatically invoked by the K8S machinery that loads the YAML
- In order to do the Scorer weight defaulting one needs to know that the plugin is a scorer, similarly to know that a profile handler or a picker were not specified, one needs to instantiate the plugins to check their types.
- Similarly, when creating a default SchedulerProfile, with only Scheduler plugins, one needs to instantiate them to know their types.
- This would require duplicating the code that instantiates plugins. The real place for such code is not inside the package apix/config/v1alpha1. If it lives elsewhere, as it should, using it in the package apix/config/v1alpha1, will likely cause an import loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was referring to the defaulting code there.
ok, I guess we need to have the registry expose interfaces to do the needed checks and somehow plumb the registry in, this is a much bigger change that we need t consider separately.
I am ok with this PR, but can you please remove the random picker and converge on the max score one as suggested in another comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I updated the code, tests, doc, and description of this PR
Looking at the code, I think it does that already; @nirrozenbaum can you confirm? |
/ok-to-test |
@ahg-g yes, max score picker is choosing randomly when all scores are equal. |
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Great, then I recommend defaulting to max score always. |
@shmuelk not related to this PR but related to config api - handle := plugins.NewEppHandle(ctx)
config, err := loader.LoadConfig(configBytes, handle)
if err != nil {
return fmt.Errorf("failed to load the configuration - %w", err)
}
setupLog.Info("Configuration file loaded", "config", config)
r.schedulerConfig, err = loader.LoadSchedulerConfig(config.SchedulingProfiles, handle)
if err != nil {
return fmt.Errorf("failed to create Scheduler configuration - %w", err)
} the thing is, we don't need the v1alpha1.EndpointPickerConfig. it's only used to load scheduler config. func LoadConfig(configBytes, plugins.NewEppHandle(ctx)) (SchedulerConfig, plugins, error). I'm aiming for returning SchedulerConfig and Plugins as this is correlated to the config file structure. @ahg-g as a side note - this line |
@nirrozenbaum I disagree with hiding the loaded config. The overall plan for the configuration was to start with the scheduler and to continue on to other parts of the system. |
that's the concept of encapsulation. type Config struct {
SchedulerConfig scheduling.SchedulerConfig
Plugins []plugins.plugin
..
// add here more fields in the future
..
} and then the function signature can be: func LoadConfig(configBytes, plugins.NewEppHandle(ctx)) (Config, error) the main point I'm trying to make is that the |
ok, I think it is still useful to also log using the structure the deployer is familiar with, which is the config api |
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
/retest |
Can we simplify the configurations in the helm charts: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/charts/inferencepool/templates/epp-config.yaml |
@ahg-g yes, we can remove all fields that use defaults. I would do in a separate PR to scope this one for adding the default options that were discussed, and a separate PR which changes config and should be done very carefully. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, shmuelk The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* If a picker wasn't specified, add one Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * If there is only one profile and a profile handler wasn't specified, add one Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * If there aren't any ScehdulingProfiles, add one Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updated and fixed tests Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updated the configuration guide Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Don't hard code the default scorer weight Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * If a picker is missing, always add a MaxScorePicker Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
This PR adds a set of defaults to the text based configuration, including:
default
with references to all of the plugins referenced in the plugins section.With the above changes the following configuration serves as an example of a minimalistic configuration:
which will be equivalent to having specified: