irar2
Contributor

@irar2 irar2 commented May 12, 2025

Added unit tests for the EPP gRPC streaming server.

Specifically, the test exercises the streaming server's handling of the various Envoy gRPC messages. The gRPC server runs on an in-memory buffered listener, which eliminates the need to find free ports on the test system. The code is structured around several helper functions that should make it easier to add more tests of this kind in the future and, in particular, should enable end-to-end tests that do not need free ports.
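
To illustrate the in-memory listener idea, here is a minimal standard-library sketch; it is not the code from this PR. The names `pipeListener`, `newPipeListener`, and `roundTrip` are hypothetical, and plain HTTP stands in for gRPC so that the example needs no external modules. With gRPC-Go, the `google.golang.org/grpc/test/bufconn` package typically plays the listener's role. The server accepts connections created by `net.Pipe`, so no TCP port is ever opened.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"sync"
)

// pipeListener is a hypothetical in-memory net.Listener. DialContext creates a
// net.Pipe, returns one end to the caller, and queues the other end for Accept.
type pipeListener struct {
	conns chan net.Conn
	done  chan struct{}
	once  sync.Once
}

func newPipeListener() *pipeListener {
	return &pipeListener{conns: make(chan net.Conn), done: make(chan struct{})}
}

// Accept blocks until a client dials or the listener is closed.
func (l *pipeListener) Accept() (net.Conn, error) {
	select {
	case c := <-l.conns:
		return c, nil
	case <-l.done:
		return nil, net.ErrClosed
	}
}

// Close is safe to call more than once.
func (l *pipeListener) Close() error {
	l.once.Do(func() { close(l.done) })
	return nil
}

type pipeAddr struct{}

func (pipeAddr) Network() string { return "pipe" }
func (pipeAddr) String() string  { return "in-memory" }

// Addr returns a placeholder address; nothing is bound on the host.
func (l *pipeListener) Addr() net.Addr { return pipeAddr{} }

// DialContext connects a client to the server without touching the network stack.
func (l *pipeListener) DialContext(ctx context.Context) (net.Conn, error) {
	client, server := net.Pipe()
	select {
	case l.conns <- server:
		return client, nil
	case <-l.done:
	case <-ctx.Done():
	}
	// The connection was never handed to Accept, so release both ends.
	client.Close()
	server.Close()
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return nil, net.ErrClosed
}

// roundTrip starts a server on the in-memory listener, sends one request
// through it, and returns the response body.
func roundTrip(path string) (string, error) {
	lis := newPipeListener()
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "handled %s", r.URL.Path)
	})}
	go srv.Serve(lis) // stops once srv.Close is called
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{
		// The address is ignored; every dial goes to the in-memory listener.
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			return lis.DialContext(ctx)
		},
	}}
	defer client.CloseIdleConnections()

	resp, err := client.Get("http://in-memory" + path)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := roundTrip("/ping")
	fmt.Println(body, err)
}
```

Because the client and server share a listener object rather than a port number, tests built this way can run in parallel without competing for host ports.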

This test uses a dummy director to choose the target pod. The goal was to test the streaming server and to set up infrastructure for more tests of this kind, not necessarily to test the director or the scheduler.


linux-foundation-easycla bot commented May 12, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: irar2 / name: Ira Rosen (a7a9412)

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and robscott May 12, 2025 09:00
@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label May 12, 2025
@k8s-ci-robot
Contributor

Welcome @irar2!

It looks like this is your first PR to kubernetes-sigs/gateway-api-inference-extension 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api-inference-extension has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 12, 2025
@k8s-ci-robot
Contributor

Hi @irar2. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


netlify bot commented May 12, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit a7a9412
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/683eb7c5b9179c0008ddcf6f
😎 Deploy Preview https://deploy-preview-820--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels May 12, 2025
@nirrozenbaum
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 12, 2025
@nirrozenbaum
Contributor

/retest

@danehans
Contributor

danehans commented Jun 2, 2025

@irar2 thanks for the PR. Can you resolve the review feedback and update the PR?

@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 3, 2025
Signed-off-by: Ira <IRAR@il.ibm.com>
@irar2 irar2 force-pushed the server-test-main branch from bcce561 to a7a9412 Compare June 3, 2025 08:52
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 3, 2025
@irar2
Contributor Author

irar2 commented Jun 3, 2025

While making changes in response to the review comments above, I tried to rebase. That left the PR showing 185 modified files, although it actually modifies only two.
To correct this, I did a soft reset to upstream main, added back my two new files, and force-pushed.

@irar2 irar2 changed the title test: Initial end-to-end test with gRPC messages test: gRPC server unit test and utilities for further end-to-end tests Jun 8, 2025
@irar2 irar2 changed the title test: gRPC server unit test and utilities for further end-to-end tests test: gRPC server unit tests and utilities for further end-to-end tests Jun 8, 2025
@nirrozenbaum
Contributor

/test pull-gateway-api-inference-extension-test-e2e-main

@kfswain
Collaborator

kfswain commented Jun 16, 2025

This LGTM, any reason not to merge?

/approve
/lgtm
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 16, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 16, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: irar2, kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 16, 2025
@nirrozenbaum
Contributor

> This LGTM, any reason not to merge?
>
> /approve /lgtm /hold

No reason not to merge.
The only reason from my side was that I had no free cycles to review.

/unhold

@irar2 Thanks and sorry for the delayed review!

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 17, 2025
@k8s-ci-robot k8s-ci-robot merged commit 5ff1e27 into kubernetes-sigs:main Jun 17, 2025
9 checks passed
shmuelk pushed a commit to shmuelk/gateway-api-inference-extension that referenced this pull request Jun 18, 2025
k8s-ci-robot pushed a commit that referenced this pull request Jun 18, 2025
…e it easier to add plugins (#881)

* configuration implementation (after rebase...)

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Moved plugin registry back to pkg/epp/plugins

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Removed unneeded 'forced imports' of scorers

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Changed 'profilepicker' to 'profilehandler' in new and old code

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Pass the configured SchedulingProfiles to LoadSchedulerConfig

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Ensure that both the configText and configFile flags are not specified

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Load RequestControl plugins from the configuration

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Register all plugin factories

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Review fixes

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Reverted unneeded change

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updates from review comments

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Added a stub interface for plugins to get data from the EPP

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Added a temporary implementation of plugins.Handle

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Added pluginName and plugins.Handle to plugin factory interface

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updated plugin factory signatures to reflect new API

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updated plugin instantiation to reflect new API

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updated plugin instantiation to reflect new API

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Updated tests to reflect new API

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Do not rename the imported package

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Only upper layer of code should log errors

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Only pass what is needed to instantiate the plugins

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Review updates

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Review update

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Review update. Make more clear that the code only checks for already defined names

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* fixed e2e doc in makefile (does not require GPUs) (#976)

Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>

* API: Adds 5xx Status Code for Invalid ExtRef (#991)

Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>

* feat(conformance): Add test for invalid EPP service reference (#959)

* fix boilerplate header

* add tests for InferencePoolInvalidEPPService

* change to expect error on httproute refcond

* moved the creation of the context to main.go. (#995)

This is useful when writing a different main, like llm-d, because it allows the same context to be propagated to the whole system.

Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>

* fix dead links (#989)

* feat: add health check for epp cluster (#966)

* feat: add health check for epp cluster

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* remove tls

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* don't use tls

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* health checking flag

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* fix import

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* add tls options

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

---------

Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>

* Server unit test and utility to help with such tests (#820)

Signed-off-by: Ira <IRAR@il.ibm.com>

* Update dynamic-lora-sidecar to expose metrics to track loaded adapters (#980)

* Add a metrics to track loaded adapters

* Update the sample manifests

* Add explanation of metrics from dynamic LoRA adapter sidecar

* Add explanation of metrics from dynamic LoRA adapter sidecar (take 2)

* Update metrics.md based on feedback

* refactor: Replace prefix cache structure with golang-lru (#928)

* refactor: Replace prefix cache structure with golang-lru

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Co-authored-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* fix: rename prefix scorer parameters and convert test to benchmark test

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

* feat: Add per server LRU capacity

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

* fix: Fix typos and error handle

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

* fix: add safety check for LRUCapacityPerServer

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>

---------

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Co-authored-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* feat(conformance): Add HTTPRouteMultipleRulesDifferentPools test (#834)

* copy of accepted inference pool test to start from.

* add yaml file for the test

* update time out

* update the yaml file to add port 9002

* read timeout config from local repo

* remove excess comments

* correct spelling for scenarios

* check route condition on RouteConditionResolvedRefs

* remove empty lines in yaml

* set optional/defaulted fields as unspecified

* fix timeout

* fix boilerplate header

* change variable names to use primary secondary consistently.

* remove extra comments

* factor out common code

* Add actual http traffic validation using echo-basic

* remove extra comments from manifest

* remove modifiedTimeoutConfig.HTTPRouteMustHaveCondition per review comment.

* intermediate update

* fix the test run

* factor out common code

* move epp def to shared manifest

* remove extra comments

* revert back to two epps

* add to do for epp image

* switch to GeneralMustHaveConditionTimeout

* undo gateway version changes

* remove unused HTTPRouteMustHaveConditions

* update doc string for GetPod

* update docstring

* Remove resource type from names in manifests.

* remove type from name

* remove health check

* add todo for combining getpod methods

* configuration implementation (after rebase...)

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* After review, made code more obvious

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Fixed merge issues

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

---------

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Nir Rozenbaum <nirro@il.ibm.com>
Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>
Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>
Signed-off-by: Ira <IRAR@il.ibm.com>
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Co-authored-by: Nir Rozenbaum <nirro@il.ibm.com>
Co-authored-by: Daneyon Hansen <daneyon.hansen@solo.io>
Co-authored-by: sina chavoshi <chavoshi@google.com>
Co-authored-by: Xudong Wang <68834160+caozhuozi@users.noreply.github.com>
Co-authored-by: Zhengke Zhou <madzhou1@gmail.com>
Co-authored-by: Ira Rosen <irar@il.ibm.com>
Co-authored-by: Shotaro Kohama <khmshtr28@gmail.com>
Co-authored-by: Kfir Toledo <kfir.toledo@gmail.com>
Co-authored-by: Maroon Ayoub <maroon.ayoub@ibm.com>

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants