-
Notifications
You must be signed in to change notification settings - Fork 179
profile handler ProcessResult returns additional return value #1013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
profile handler ProcessResult returns additional return value #1013
Conversation
…alue Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
cc @kfswain |
// It may aggregate results, log test profile outputs, or apply custom logic. It specifies in the SchedulingResult the | ||
// key of the primary profile that should be used to get the request selected destination. | ||
ProcessResults(ctx context.Context, request *types.LLMRequest, profileResults map[string]*types.ProfileRunResult) *types.SchedulingResult | ||
// When a profile run fails, its result in the profileResults map is nil. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
confirming: it will be an empty entry (key + nil), not no entry (no key at all)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right. if prefill for example fail we will have an entry in the map "prefill" -> nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want SchedulingResult to carry an error so that erroring out is explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that depends on the plugin implementation. whoever implements ProfileHandler can decide to filter out from the SchedulingResult the profiles that failed, or alternatively leave them with nil for being explicit. both are possible.
In GIE there is a single profile. having a failure means we cannot schedule the request. in different scenarios like llm-d PD it’s not always the case (see description in the PR intro)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant changing ProfileRunResult to include the error parameter instead of assuming that there is an error when it is nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as part of the scheduling new design, one of the first changes that were made was to remove the error return value from all extensions of a profile. that is - filter, scorer and picker - none of them returns an error.
the only case where an error is returned from a Profile run is when no pods left available after the filter phase.
see here:
gateway-api-inference-extension/pkg/epp/scheduling/framework/scheduler_profile.go
Lines 109 to 122 in e29fa4b
func (p *SchedulerProfile) Run(ctx context.Context, request *types.LLMRequest, cycleState *types.CycleState, podsSnapshot []types.Pod) (*types.ProfileRunResult, error) { | |
pods := p.runFilterPlugins(ctx, request, cycleState, podsSnapshot) | |
if len(pods) == 0 { | |
return nil, errutil.Error{Code: errutil.Internal, Msg: "no pods available for the given request"} | |
} | |
// if we got here, there is at least one pod to score | |
weightedScorePerPod := p.runScorerPlugins(ctx, request, cycleState, pods) | |
result := p.runPickerPlugin(ctx, cycleState, weightedScorePerPod) | |
p.runPostCyclePlugins(ctx, cycleState, result) | |
return result, nil | |
} |
additionally, we can notice that when that happens, the returned ProfileRunResult is nil.
in the scheduler itself, the code in this PR is implemented to include the nil result when error happens.
see here:
gateway-api-inference-extension/pkg/epp/scheduling/scheduler.go
Lines 122 to 130 in e29fa4b
for name, profile := range profiles { | |
// run the selected profiles and collect results (current code runs all profiles) | |
profileRunResult, err := profile.Run(ctx, request, cycleState, podsSnapshot) | |
if err != nil { | |
loggerDebug.Info("failed to run scheduler profile", "profile", name, "error", err.Error()) | |
} | |
profileRunResults[name] = profileRunResult // if profile failed to run, the run result is nil | |
} |
I feel comfortable with leaving it as is, but if you have a strong opinion about adding explicit error field to ProfileRunResult we can do that. just don't think it's necessary, not atm at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
break | ||
} | ||
|
||
if profileResults[singleProfileName] == nil { // there was an error while running the profile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, it seems that you are special casing the first profile, even when there are multiple.
Consider raising the error only when there's a single profile result and it is nil (i.e., special case on if len(profileResults) == 1 { ... }
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this profile handler is intended to be used for a single profile, as the name suggests SingleProfileHandler
.
added validation that it includes a single profile.
Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
/approve Holding in case we want to place the error inside SchedulingResult. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, nirrozenbaum The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
after this PR - /unhold |
…etes-sigs#1013) * profile handler ProcessResult returns an error as additional return value Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * minor update Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * fixed log Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * validate that single profile handler process only single profile Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
…etes-sigs#1013) * profile handler ProcessResult returns an error as additional return value Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * minor update Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * fixed log Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * validate that single profile handler process only single profile Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
…etes-sigs#1013) * profile handler ProcessResult returns an error as additional return value Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * minor update Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * fixed log Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * validate that single profile handler process only single profile Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
This PR make minor change to scheduler and ProfileHandler and delegates the decision on whether the Schedule call should fail or not when a single profile run fails.
more background about this PR:
in llm-d we have two configured profiles - Prefill and Decode.
if decode profile fails, we would like to fail the request. on the other hand, if prefill profile run fails, we would like to serve the request by decode only. this could happen if there is no available prefill pod at the time the request is sent.
More generally, since profiles and plugins are extensible and pluggable, it makes sense to delegate the decision on whether failing profile should fail the whole request or not to ProcessResults.
in the case where profile fails, it's profile result will be nil.