Skip to content

Conversation

shmuelk
Copy link
Contributor

@shmuelk shmuelk commented Jul 23, 2025

This PR adds a set of defaults to the text based configuration, including:

  1. If no SchedulingProfiles were specified, a default profile is added, named default with references to all of the plugins referenced in the plugins section.
  2. If a profile handler wasn't specified and only one profile was specified, or defaulted, the SingleProfileHandler is added.
  3. If a picker wasn't specified, then the MaxScorePicker will be added to the profile.
  4. If the weight is left out of a reference to a scorer, a weight of one, will be specified.

With the above changes the following configuration serves as an example of a minimalistic configuration:

apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: EndpointPickerConfig
plugins:
- type: prefix-cache-scorer
  parameters:
    hashBlockSize: 32

which will be equivalent to having specified:

apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: EndpointPickerConfig
plugins:
- type: single-profile-handler
- type: prefix-cache-scorer
  parameters:
    hashBlockSize: 32
- type: max-score-picker
schedulingProfiles:
- name: default
  plugins:
  - pluginRef: prefix-cache-scorer
    weight: 1
  - pluginRef: max-score-picker

shmuelk added 5 commits July 23, 2025 16:37
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
…add one

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Copy link

netlify bot commented Jul 23, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 94b39bf
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68826d937e378f00088b5e00
😎 Deploy Preview https://deploy-preview-1223--gateway-api-inference-extension.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 23, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 23, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @shmuelk. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 23, 2025
@ahg-g
Copy link
Contributor

ahg-g commented Jul 23, 2025

ok-to-test

@ahg-g
Copy link
Contributor

ahg-g commented Jul 23, 2025

If the set of referenced plugins includes one or more scorers, the MaxScorePicker will be added to the profile
Otherwise, the RandomPicker will be added to the profile.

The MaxScorePicker should behave as the RandomPicker when there are no scores, I recommend making it that way if that is not the case now; with this, we would always default to MaxScorePicker and remove the RandomPicker.

@shmuelk
Copy link
Contributor Author

shmuelk commented Jul 23, 2025

If the set of referenced plugins includes one or more scorers, the MaxScorePicker will be added to the profile
Otherwise, the RandomPicker will be added to the profile.

The MaxScorePicker should behave as the RandomPicker when there are no scores, I recommend making it that way if that >is not the case now; with this, we would always default to MaxScorePicker and remove the RandomPicker.

This PR is working with the pickers as they are. I agree with your idea, I just think it needs to be done in a separate PR, which is a breaking API/UX PR.

Copy link
Contributor

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!


thePlugins := []SchedulingPlugin{}
for _, pluginConfig := range cfg.Plugins {
thePlugins = append(thePlugins, SchedulingPlugin{PluginRef: pluginConfig.Name})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we only add the plugins that are applicable allowed in a scheduling profile? scorer, filter, picker?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point in time in the code, one doesn't know what the plugins are as they have not yet been instantiated.

If you want, I can move this code to the LoadSchedulingConfig function, with appropriate changes to the configuration validation code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend consolidating validation in this pkg, this is the practice, validation is invoked after defaulting automatically. We can do that in a followup PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the code in ScheculingProfile.AddPlugins, ignores any plugins that are of types other than filter, scorer, picker, or postCycle

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, but when we default and log the config, they will show up, silently ignoring them is not ideal.

if scorer, ok := referencedPlugin.(framework.Scorer); ok {
if plugin.Weight == nil {
return nil, fmt.Errorf("scorer '%s' is missing a weight", plugin.PluginRef)
// Set default weight to 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly recommend consolidating all defaulting in the defaults.go for two reasons: 1) it is harder to reason about the default behavior and debug it when it is scattered 2) logging the actual configuration to be applied become easier and the log at https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/cmd/epp/runner/runner.go#L407 would show all the defaults we apply

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the number one to a constant. At the point in time in the code where there is logging, the addition of the profile handler and picker are not yet done, In addition the setting of the default weight for scorers also happens latter on.

I can change the code to modify the structs of the read in configuration and move the logging a bit, if you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not about the weight only, I am referring to the whole logic we added here, including adding a default picker and default handler. I think all that should be done in the defaulting file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there are two defaults.go files in the repo, I want to make sure we are talking about the same file. I assume you are talking about defaults.go in the package apix/config/v1alpha1.

If I am correct, then:

  1. This code is automatically invoked by the K8S machinery that loads the YAML
  2. In order to do the Scorer weight defaulting one needs to know that the plugin is a scorer, similarly to know that a profile handler or a picker were not specified, one needs to instantiate the plugins to check their types.
  3. Similarly, when creating a default SchedulerProfile, with only Scheduler plugins, one needs to instantiate them to know their types.
  4. This would require duplicating the code that instantiates plugins. The real place for such code is not inside the package apix/config/v1alpha1. If it lives elsewhere, as it should, using it in the package apix/config/v1alpha1, will likely cause an import loop

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was referring to the defaulting code there.

ok, I guess we need to have the registry expose interfaces to do the needed checks and somehow plumb the registry in, this is a much bigger change that we need t consider separately.

I am ok with this PR, but can you please remove the random picker and converge on the max score one as suggested in another comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I updated the code, tests, doc, and description of this PR

@ahg-g
Copy link
Contributor

ahg-g commented Jul 23, 2025

If the set of referenced plugins includes one or more scorers, the MaxScorePicker will be added to the profile
Otherwise, the RandomPicker will be added to the profile.

The MaxScorePicker should behave as the RandomPicker when there are no scores, I recommend making it that way if that >is not the case now; with this, we would always default to MaxScorePicker and remove the RandomPicker.

This PR is working with the pickers as they are. I agree with your idea, I just think it needs to be done in a separate PR, which is a breaking API/UX PR.

Looking at the code, I think it does that already; @nirrozenbaum can you confirm?

@ahg-g
Copy link
Contributor

ahg-g commented Jul 23, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 23, 2025
@nirrozenbaum
Copy link
Contributor

nirrozenbaum commented Jul 23, 2025

If the set of referenced plugins includes one or more scorers, the MaxScorePicker will be added to the profile
Otherwise, the RandomPicker will be added to the profile.

The MaxScorePicker should behave as the RandomPicker when there are no scores, I recommend making it that way if that >is not the case now; with this, we would always default to MaxScorePicker and remove the RandomPicker.

This PR is working with the pickers as they are. I agree with your idea, I just think it needs to be done in a separate PR, which is a breaking API/UX PR.

Looking at the code, I think it does that already; @nirrozenbaum can you confirm?

@ahg-g yes, max score picker is choosing randomly when all scores are equal.
I wouldn't remove random picker though, it's a very useful picker for benchmarks.

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
@ahg-g
Copy link
Contributor

ahg-g commented Jul 23, 2025

@ahg-g yes, max score picker is choosing randomly when all scores are equal.
I wouldn't remove random picker though, it's a very useful picker for benchmarks.

Great, then I recommend defaulting to max score always.

@nirrozenbaum
Copy link
Contributor

@shmuelk not related to this PR but related to config api -
can we make only one public function?
today we have two public functions and the calls in runner look like this:

handle := plugins.NewEppHandle(ctx)
config, err := loader.LoadConfig(configBytes, handle)
if err != nil {
return fmt.Errorf("failed to load the configuration - %w", err)
}

setupLog.Info("Configuration file loaded", "config", config)

r.schedulerConfig, err = loader.LoadSchedulerConfig(config.SchedulingProfiles, handle)
if err != nil {
return fmt.Errorf("failed to create Scheduler configuration - %w", err)
}

the thing is, we don't need the v1alpha1.EndpointPickerConfig. it's only used to load scheduler config.
so I would expect something like:

func LoadConfig(configBytes, plugins.NewEppHandle(ctx)) (SchedulerConfig, plugins, error).

I'm aiming for returning SchedulerConfig and Plugins as this is correlated to the config file structure.
wdyt? (no need to include in this PR, can be done in follow up)

@ahg-g as a side note - this line setupLog.Info("Configuration file loaded", "config", config) is not logging the config.. when I checked I saw in the log "config", "*v1alpha1.EndpointPickerConfig{}".
logging the config can be internally inside the load function or alternatively we can log the returned schedulerConfig + plugins which I think is a better option (we would like to log the actual configuration).
logging can be handled separately in a different issue.

@shmuelk
Copy link
Contributor Author

shmuelk commented Jul 24, 2025

@nirrozenbaum I disagree with hiding the loaded config. The overall plan for the configuration was to start with the scheduler and to continue on to other parts of the system.

@nirrozenbaum
Copy link
Contributor

@nirrozenbaum I disagree with hiding the loaded config. The overall plan for the configuration was to start with the scheduler and to continue on to other parts of the system.

that's the concept of encapsulation.
for adding more fields in the future we can define a Config struct that holds the parsed configuration, e.g.,

type Config struct {
    SchedulerConfig scheduling.SchedulerConfig
    Plugins []plugins.plugin
    ..
    // add here more fields in the future 
    ..
}

and then the function signature can be:

func LoadConfig(configBytes, plugins.NewEppHandle(ctx)) (Config, error)

the main point I'm trying to make is that the v1alpha1.EndpointPickerConfig is an intermediate parsed object that can be internal to the config package as we don't care about it outside of that package (it's an implementation detail of the package). we care only about the final configuration and not about the intermediate objects.

@ahg-g
Copy link
Contributor

ahg-g commented Jul 24, 2025

@ahg-g as a side note - this line setupLog.Info("Configuration file loaded", "config", config) is not logging the config.. when I checked I saw in the log "config", "*v1alpha1.EndpointPickerConfig{}".
logging the config can be internally inside the load function or alternatively we can log the returned schedulerConfig + plugins which I think is a better option (we would like to log the actual configuration).
logging can be handled separately in a different issue.

ok, I think it is still useful to also log using the structure the deployer is familiar with, which is the config api

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
@nirrozenbaum
Copy link
Contributor

/retest

@ahg-g
Copy link
Contributor

ahg-g commented Jul 24, 2025

@nirrozenbaum
Copy link
Contributor

Can we simplify the configurations in the helm charts: https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/charts/inferencepool/templates/epp-config.yaml

@ahg-g yes, we can remove all fields that use defaults. I would do in a separate PR to scope this one for adding the default options that were discussed, and a separate PR which changes config and should be done very carefully.

@ahg-g
Copy link
Contributor

ahg-g commented Jul 25, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 25, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, shmuelk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 25, 2025
@k8s-ci-robot k8s-ci-robot merged commit 0984c50 into kubernetes-sigs:main Jul 25, 2025
9 checks passed
kfswain pushed a commit to kfswain/llm-instance-gateway that referenced this pull request Jul 31, 2025
* If a picker wasn't specified, add one

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* If there is only one profile and a profile handler wasn't specified, add one

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* If there aren't any ScehdulingProfiles, add one

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updated and fixed tests

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updated the configuration guide

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Don't hard code the default scorer weight

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* If a picker is missing, always add a MaxScorePicker

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

---------

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
@shmuelk shmuelk deleted the config-defaults branch August 4, 2025 11:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants