Skip to content

Commit c4f58f1

Browse files
kaushikmitrBenjaminBraunDev
authored andcommitted
seperate servers for training and prediction
Add APIs for the instantiated plugins to the EPP Handle (kubernetes-sigs#1039) * Added plugin instance APIs to plugins.Handle Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * An implementation of the new plugins.Handle APIs Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Moved all configuration loading code to new package Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to new and moved APIs Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Cleanup of old configuration loading code Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> chore(deps): bump the kubernetes group with 6 updates (kubernetes-sigs#1050) Bumps the kubernetes group with 6 updates: | Package | From | To | | --- | --- | --- | | [k8s.io/api](https://github.com/kubernetes/api) | `0.33.1` | `0.33.2` | | [k8s.io/apiextensions-apiserver](https://github.com/kubernetes/apiextensions-apiserver) | `0.33.1` | `0.33.2` | | [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) | `0.33.1` | `0.33.2` | | [k8s.io/client-go](https://github.com/kubernetes/client-go) | `0.33.1` | `0.33.2` | | [k8s.io/code-generator](https://github.com/kubernetes/code-generator) | `0.33.1` | `0.33.2` | | [k8s.io/component-base](https://github.com/kubernetes/component-base) | `0.33.1` | `0.33.2` | Updates `k8s.io/api` from 0.33.1 to 0.33.2 - [Commits](kubernetes/api@v0.33.1...v0.33.2) Updates `k8s.io/apiextensions-apiserver` from 0.33.1 to 0.33.2 - [Release notes](https://github.com/kubernetes/apiextensions-apiserver/releases) - [Commits](kubernetes/apiextensions-apiserver@v0.33.1...v0.33.2) Updates `k8s.io/apimachinery` from 0.33.1 to 0.33.2 - [Commits](kubernetes/apimachinery@v0.33.1...v0.33.2) Updates `k8s.io/client-go` from 0.33.1 to 0.33.2 - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.33.1...v0.33.2) Updates `k8s.io/code-generator` from 0.33.1 to 0.33.2 - [Commits](kubernetes/code-generator@v0.33.1...v0.33.2) Updates `k8s.io/component-base` from 0.33.1 to 0.33.2 - [Commits](kubernetes/component-base@v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: k8s.io/api dependency-version: 0.33.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/apiextensions-apiserver dependency-version: 0.33.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/apimachinery dependency-version: 0.33.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/client-go dependency-version: 0.33.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/code-generator dependency-version: 0.33.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/component-base dependency-version: 0.33.2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> remove datastore dependency from the scheduler (kubernetes-sigs#1049) * remove datastore dependency from the scheduler Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * added back comments on snapshotting pods from datastore before calling schedule Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * removed fake datastore from conformance scheduler test Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> Add subsetting logic for epp (kubernetes-sigs#981) feat: Added a factory function for the DecisionTree filter (kubernetes-sigs#1053) * Added a factory function for the DecisionTreeFilter Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added tests of the factory function of the DecisionTreeFilter Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Registered the factory function of the DecisionTreeFilter Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Refactored the configuration loading Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Adding pprof endpoints to metrics port (kubernetes-sigs#1069) feat: Add a context.Context to the plugins.HAndle interface (kubernetes-sigs#1076) * Added a context.Context to the plugins.Handle interface Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Changes due to changes in internal APIs Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Changes to tests due to changes in internal APIs Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> convert subset filter from a plugin to logic in director (kubernetes-sigs#1088) * convert subset filter from a plugin to logic in director Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * replace interface{} with any Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make linter happy Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * address code review comments Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> chore(deps): bump golang.org/x/sync from 0.14.0 to 0.15.0 (kubernetes-sigs#1096) Bumps [golang.org/x/sync](https://github.com/golang/sync) from 0.14.0 to 0.15.0. - [Commits](golang/sync@v0.14.0...v0.15.0) --- updated-dependencies: - dependency-name: golang.org/x/sync dependency-version: 0.15.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Introduce plugins.TypedName to be used for Plugin base implementation (kubernetes-sigs#1086) * introduce TypedName to reduce boilerplate, modify plugins Signed-off-by: Etai Lev Ran <elevran@gmail.com> * implement GetTypedName() Signed-off-by: Etai Lev Ran <elevran@gmail.com> * Remove Type() and Name() from Plugin interface Signed-off-by: Etai Lev Ran <elevran@gmail.com> * use TypedName as private field, not embedded Signed-off-by: Etai Lev Ran <elevran@gmail.com> --------- Signed-off-by: Etai Lev Ran <elevran@gmail.com> move the conversion from pod metrics to scheduler pod representation one level up (kubernetes-sigs#1104) * move the converstion from pod metrics to scheduler pod representation one level up Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * minor change in helper func Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> handle picking multiple destinations in scheduling layer (kubernetes-sigs#1059) * implement multiple destination as the output of the scheduler Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * updated max score picker unit tests to cover multiple pods Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * imports Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * unit-test fix Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> refactor: 🔨 use the more explicit singular form (kubernetes-sigs#1129)
1 parent b4c2f8c commit c4f58f1

38 files changed

+6758
-187
lines changed

cmd/epp/runner/register.go

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
1+
/*
2+
Copyright 2025 The Kubernetes Authors.
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
http://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
*/
16+
17+
package runner
18+
19+
import (
20+
"context"
21+
22+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/plugins"
23+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/framework/plugins/filter"
24+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/framework/plugins/multi/prefix"
25+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/framework/plugins/picker"
26+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/framework/plugins/profile"
27+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/scheduling/framework/plugins/scorer"
28+
)
29+
30+
// RegisterAllPlugins registers the factory functions of all known plugins
31+
func RegisterAllPlugins() {
32+
plugins.Register(filter.DecisionTreeFilterType, filter.DecisionTreeFilterFactory)
33+
plugins.Register(filter.LeastKVCacheFilterType, filter.LeastKVCacheFilterFactory)
34+
plugins.Register(filter.LeastQueueFilterType, filter.LeastQueueFilterFactory)
35+
plugins.Register(filter.LoraAffinityFilterType, filter.LoraAffinityFilterFactory)
36+
plugins.Register(filter.LowQueueFilterType, filter.LowQueueFilterFactory)
37+
plugins.Register(prefix.PrefixCachePluginType, prefix.PrefixCachePluginFactory)
38+
plugins.Register(picker.MaxScorePickerType, picker.MaxScorePickerFactory)
39+
plugins.Register(picker.RandomPickerType, picker.RandomPickerFactory)
40+
plugins.Register(profile.SingleProfileHandlerType, profile.SingleProfileHandlerFactory)
41+
plugins.Register(scorer.KvCacheScorerType, scorer.KvCacheScorerFactory)
42+
plugins.Register(scorer.QueueScorerType, scorer.QueueScorerFactory)
43+
}
44+
45+
// eppHandle is an implementation of the interface plugins.Handle
46+
type eppHandle struct {
47+
ctx context.Context
48+
plugins plugins.HandlePlugins
49+
}
50+
51+
// Context returns a context the plugins can use, if they need one
52+
func (h *eppHandle) Context() context.Context {
53+
return h.ctx
54+
}
55+
56+
// Plugins returns the sub-handle for working with instantiated plugins
57+
func (h *eppHandle) Plugins() plugins.HandlePlugins {
58+
return h.plugins
59+
}
60+
61+
// eppHandlePlugins implements the set of APIs to work with instantiated plugins
62+
type eppHandlePlugins struct {
63+
thePlugins map[string]plugins.Plugin
64+
}
65+
66+
// Plugin returns the named plugin instance
67+
func (h *eppHandlePlugins) Plugin(name string) plugins.Plugin {
68+
return h.thePlugins[name]
69+
}
70+
71+
// AddPlugin adds a plugin to the set of known plugin instances
72+
func (h *eppHandlePlugins) AddPlugin(name string, plugin plugins.Plugin) {
73+
h.thePlugins[name] = plugin
74+
}
75+
76+
// GetAllPlugins returns all of the known plugins
77+
func (h *eppHandlePlugins) GetAllPlugins() []plugins.Plugin {
78+
result := make([]plugins.Plugin, 0)
79+
for _, plugin := range h.thePlugins {
80+
result = append(result, plugin)
81+
}
82+
return result
83+
}
84+
85+
// GetAllPluginsWithNames returns al of the known plugins with their names
86+
func (h *eppHandlePlugins) GetAllPluginsWithNames() map[string]plugins.Plugin {
87+
return h.thePlugins
88+
}
89+
90+
func newEppHandle(ctx context.Context) *eppHandle {
91+
return &eppHandle{
92+
ctx: ctx,
93+
plugins: &eppHandlePlugins{
94+
thePlugins: map[string]plugins.Plugin{},
95+
},
96+
}
97+
}

cmd/epp/runner/runner.go

Lines changed: 73 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ import (
4545
"sigs.k8s.io/gateway-api-inference-extension/internal/runnable"
4646
"sigs.k8s.io/gateway-api-inference-extension/pkg/common"
4747
backendmetrics "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/backend/metrics"
48-
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/config/loader"
48+
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/common/config/loader"
4949
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/datalayer"
5050
dlmetrics "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/datalayer/metrics"
5151
"sigs.k8s.io/gateway-api-inference-extension/pkg/epp/datastore"
@@ -214,6 +214,11 @@ func (r *Runner) Run(ctx context.Context) error {
214214
setupLog.Error(err, "Failed to create controller manager")
215215
return err
216216
}
217+
err = setupPprofHandlers(mgr)
218+
if err != nil {
219+
setupLog.Error(err, "Failed to setup pprof handlers")
220+
return err
221+
}
217222

218223
// ===================================================================
219224
// == Latency Predictor Integration
@@ -257,37 +262,40 @@ func (r *Runner) Run(ctx context.Context) error {
257262
}
258263
}
259264

265+
// START DIFF
266+
// below is what was incomming
267+
err = r.parseConfiguration(ctx)
268+
if err != nil {
269+
setupLog.Error(err, "Failed to parse the configuration")
270+
return err
271+
}
272+
273+
// below is what was current
260274
if len(*configText) != 0 || len(*configFile) != 0 {
261-
theConfig, err := config.LoadConfig([]byte(*configText), *configFile)
275+
theConfig, err := loader.LoadConfig([]byte(*configText), *configFile)
262276
if err != nil {
263277
setupLog.Error(err, "Failed to load the configuration")
264278
return err
265279
}
266280

267-
epp := eppHandle{}
268-
instantiatedPlugins, err := config.LoadPluginReferences(theConfig.Plugins, epp)
281+
epp := newEppHandle()
282+
283+
err = loader.LoadPluginReferences(theConfig.Plugins, epp)
269284
if err != nil {
270285
setupLog.Error(err, "Failed to instantiate the plugins")
271286
return err
272287
}
273-
}
274288

275-
r.schedulerConfig, err = scheduling.LoadSchedulerConfig(theConfig.SchedulingProfiles, instantiatedPlugins)
276-
if err != nil {
277-
setupLog.Error(err, "Failed to create Scheduler configuration")
278-
return err
279-
}
280-
281-
err = r.parsePluginsConfiguration(ctx)
282-
if err != nil {
283-
setupLog.Error(err, "Failed to parse plugins configuration")
284-
return err
285-
}
289+
r.schedulerConfig, err = loader.LoadSchedulerConfig(theConfig.SchedulingProfiles, epp)
290+
if err != nil {
291+
setupLog.Error(err, "Failed to create Scheduler configuration")
292+
return err
293+
}
286294

287-
// Add requestcontrol plugins
288-
if instantiatedPlugins != nil {
289-
r.requestControlConfig = requestcontrol.LoadRequestControlConfig(instantiatedPlugins)
295+
// Add requestControl plugins
296+
r.requestControlConfig.AddPlugins(epp.Plugins().GetAllPlugins()...)
290297
}
298+
// END DIFF
291299

292300
// --- Initialize Core EPP Components ---
293301
if r.schedulerConfig == nil {
@@ -470,6 +478,31 @@ func setupDatalayer() (datalayer.EndpointFactory, error) {
470478
return factory, nil
471479
}
472480

481+
func (r *Runner) parseConfiguration(ctx context.Context) error {
482+
if len(*configText) != 0 || len(*configFile) != 0 {
483+
theConfig, err := loader.LoadConfig([]byte(*configText), *configFile)
484+
if err != nil {
485+
return fmt.Errorf("failed to load the configuration - %w", err)
486+
}
487+
488+
epp := newEppHandle(ctx)
489+
490+
err = loader.LoadPluginReferences(theConfig.Plugins, epp)
491+
if err != nil {
492+
return fmt.Errorf("failed to instantiate the plugins - %w", err)
493+
}
494+
495+
r.schedulerConfig, err = loader.LoadSchedulerConfig(theConfig.SchedulingProfiles, epp)
496+
if err != nil {
497+
return fmt.Errorf("failed to create Scheduler configuration - %w", err)
498+
}
499+
500+
// Add requestControl plugins
501+
r.requestControlConfig.AddPlugins(epp.Plugins().GetAllPlugins()...)
502+
}
503+
return nil
504+
}
505+
473506
func initLogging(opts *zap.Options) {
474507
// Unless -zap-log-level is explicitly set, use -v
475508
useV := true
@@ -583,3 +616,24 @@ func (p *predictorRunnable) Start(ctx context.Context) error {
583616
p.predictor.Stop()
584617
return nil
585618
}
619+
620+
// setupPprofHandlers only implements the pre-defined profiles:
621+
// https://cs.opensource.google/go/go/+/refs/tags/go1.24.4:src/runtime/pprof/pprof.go;l=108
622+
func setupPprofHandlers(mgr ctrl.Manager) error {
623+
var err error
624+
profiles := []string{
625+
"heap",
626+
"goroutine",
627+
"allocs",
628+
"threadcreate",
629+
"block",
630+
"mutex",
631+
}
632+
for _, p := range profiles {
633+
err = mgr.AddMetricsServerExtraHandler("/debug/pprof/"+p, pprof.Handler(p))
634+
if err != nil {
635+
return err
636+
}
637+
}
638+
return nil
639+
}

0 commit comments

Comments
 (0)