
Conversation

@kfswain (Collaborator) commented Jun 19, 2025

Fixes: #1001

This PR makes the InferenceModel an optional CRD. If no matching InferenceModel is found for a requested model, a default model object is used with the lowest criticality.

Currently this is hard-coded just to unblock: #1001

Future PRs will allow the default to be configured (as part of the InferenceSchedulingObjective implementation: #1007).
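For reference, the fallback boils down to synthesizing an in-memory InferenceModel whenever the lookup for the requested model comes back empty. Below is a self-contained sketch using simplified stand-in types; the real definitions live in the project's api/v1alpha2 package, and the actual change is shown in the diff excerpts later in this thread:

```go
package main

import "fmt"

// Simplified stand-ins for the v1alpha2 API types touched by this PR;
// the real definitions live in the project's api/v1alpha2 package.
type Criticality string

const (
	Critical  Criticality = "Critical"
	Standard  Criticality = "Standard"
	Sheddable Criticality = "Sheddable"
)

type InferenceModelSpec struct {
	ModelName   string
	Criticality *Criticality
}

type InferenceModel struct {
	Spec InferenceModelSpec
}

// defaultInferenceModel mirrors the fallback this PR introduces: when no
// InferenceModel matches the requested model name, synthesize one at the
// lowest criticality instead of rejecting the request.
func defaultInferenceModel(requestedModel string) *InferenceModel {
	sheddable := Sheddable
	return &InferenceModel{
		Spec: InferenceModelSpec{
			ModelName:   requestedModel,
			Criticality: &sheddable,
		},
	}
}

func main() {
	// Model name is illustrative only.
	m := defaultInferenceModel("my-model")
	fmt.Println(m.Spec.ModelName, *m.Spec.Criticality) // my-model Sheddable
}
```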

netlify bot commented Jun 19, 2025

Deploy Preview for gateway-api-inference-extension ready!

🔨 Latest commit: 021aec3
🔍 Latest deploy log: https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68546ec45838190008eb1456
😎 Deploy Preview: https://deploy-preview-1024--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 19, 2025
@k8s-ci-robot k8s-ci-robot requested review from Jeffwan and liu-cong June 19, 2025 19:12
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 19, 2025
@kfswain kfswain force-pushed the inf-model-optional branch from 50d0abb to bbd6b8b Compare June 19, 2025 19:34
Comment on lines 96 to 102
logger.Info("No associated inferenceModel found, using default", "Requested Model", reqCtx.Model)
sheddable := v1alpha2.Sheddable
modelObj = &v1alpha2.InferenceModel{
Spec: v1alpha2.InferenceModelSpec{
ModelName: reqCtx.Model,
Criticality: &sheddable,
},
@nirrozenbaum (Contributor) commented Jun 19, 2025

I was just writing the question "do we need TargetModels" and you removed it :).

So, just to verify I got it right: if the requested modelName is not found in the datastore, we simply pass the request through, leaving the modelName field as is, with the lowest criticality. Since it's not found in the datastore, we already know it's going to fail, right? No pod serves this model. Also, in some situations the request might be dropped because it's a sheddable request.

I understand this is done for conformance, but what exactly are we trying to test here?

@kfswain (Collaborator, Author)

Convenience: the model might be available on the model server, but a user doesn't want to set up an InferenceModel for it.

So the request might not necessarily fail.

@nirrozenbaum (Contributor)
got it 👍

@nirrozenbaum (Contributor) left a comment

/lgtm

left one nit.
/hold if you want to address the comment.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 19, 2025
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nirrozenbaum (Contributor)

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 19, 2025
@kfswain kfswain force-pushed the inf-model-optional branch from bbd6b8b to 021aec3 Compare June 19, 2025 20:10
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 19, 2025
@nirrozenbaum (Contributor)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 19, 2025
@nirrozenbaum (Contributor)

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 19, 2025
@k8s-ci-robot k8s-ci-robot merged commit 4433ad5 into kubernetes-sigs:main Jun 19, 2025
9 checks passed
```diff
 if modelObj == nil {
-	return reqCtx, errutil.Error{Code: errutil.BadConfiguration, Msg: fmt.Sprintf("error finding a model object in InferenceModel for input %v", reqCtx.Model)}
+	logger.Info("No associated inferenceModel found, using default", "model", reqCtx.Model)
+	sheddable := v1alpha2.Sheddable
```
Contributor

Default should probably be standard, not sheddable.
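As an aside on the configurable default deferred to #1007: a flag-based version could look roughly like the sketch below. This is hypothetical; the flag name and wiring are invented for illustration and are not the project's actual API.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Hypothetical flag for the default criticality applied when no
// InferenceModel matches a request. This PR hard-codes Sheddable;
// making it configurable is deferred to the work tracked in #1007.
var defaultCriticality = flag.String("default-criticality", "Sheddable",
	"criticality for requests with no matching InferenceModel (Critical, Standard, or Sheddable)")

func main() {
	flag.Parse()
	switch *defaultCriticality {
	case "Critical", "Standard", "Sheddable":
		fmt.Println("default criticality:", *defaultCriticality)
	default:
		fmt.Fprintf(os.Stderr, "unknown criticality %q\n", *defaultCriticality)
		os.Exit(1)
	}
}
```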

rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 26, 2025
rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 26, 2025
EyalPazz pushed a commit to EyalPazz/gateway-api-inference-extension that referenced this pull request Jul 9, 2025
@kfswain kfswain deleted the inf-model-optional branch July 31, 2025 21:25
Development

Successfully merging this pull request may close these issues.

Refactor EPP to decouple core logic from InferenceModel resource
4 participants