-
Notifications
You must be signed in to change notification settings - Fork 180
feat: Add support to invoke PostResponse plugins #800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add support to invoke PostResponse plugins #800
Conversation
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
…gins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Hi @shmuelk. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
pkg/epp/scheduling/types/types.go
Outdated
func NewSchedulingContext(ctx context.Context, req *LLMRequest, resp *LLMResponse, pods []Pod) *SchedulingContext { | ||
var logger logr.Logger | ||
if req != nil { | ||
logger = log.FromContext(ctx).WithValues("request", req) | ||
} else { | ||
logger = log.FromContext(ctx).WithValues("response", resp) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it makes sense to separate SchedulingRequestContext and SchedulingResponseContext?
for example, when using context for response, do you use PodSnapshot or only the selected pod?
maybe we could put TargetPod
in ResponseContext
instead of calculating PodsSnapshot
which could be expensive in large scale scenario.
would be good if each gets only the data it uses and not super-object that contains all.
Apologies for the delay! I think there is some refactoring we could do here to make this implementation a little cleaner, but these were decisions made before this PR. This does move the needle forward, and actually wires up the /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kfswain, shmuelk The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/ok-to-test |
Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
* generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * typo Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make linter happy Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
* remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * rebase handling Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
…headers in them Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
/lgtm |
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates to tests Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * typo Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make linter happy Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * rebase handling Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> Co-authored-by: Nir Rozenbaum <nirro@il.ibm.com>
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates to tests Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * typo Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make linter happy Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * rebase handling Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> Co-authored-by: Nir Rozenbaum <nirro@il.ibm.com>
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates to tests Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * typo Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make linter happy Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * rebase handling Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> Co-authored-by: Nir Rozenbaum <nirro@il.ibm.com>
* Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to NewSchedulerContext API change Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Populate the RequestId field of LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates to tests Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added PostResponse plugins to scheduler config Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added scheduler.OnResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added dispatcher.HandleResponse to handle responses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Refactored server response header handling to invoke PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Added simple test for PostResponse plugins Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Setup the logger in the SchedulerContext appropriately for reponses Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Updates due to rebase issues * merge functions in env utils (kubernetes-sigs#819) Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * generalize scheduling cycle state concept (kubernetes-sigs#818) * generalize scheduling cycle state concept Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * typo Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make linter happy Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * make prefix state struct internal to package instead of public Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * remove Model field from LLMRequest (kubernetes-sigs#782) * remove Model field from LLMRequest Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * rebase handling Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> --------- Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> * Added the LLMResponse struct and RequestId to LLMRequest Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Insure that wanted response header messages have all of the response headers in them Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com> Co-authored-by: Nir Rozenbaum <nirro@il.ibm.com>
This PR adds the support to invoke the PostResponse scheduler plugins during the processing of response headers.
This PR with the PostResponse plugins invoked at response header processing time, enables developers to add headers to those that are sent to the client. This might include a session id or session token, usefull in session aware routing.
The changes included in this PR include:
A more complex test that includes sending the gRPC messages will be added in a future PR that we plan to submit.
Note: This PR is dependent on PR #799