Description
What would you like to be added
Add end-to-end (e2e) tests to verify that Dynamic LoRA Adapter Sidecar metrics can be properly scraped from the metrics endpoint at port 8080.
Why is this needed
As discussed in PR #980, we need comprehensive e2e tests that include test cases for scraping Dynamic LoRA Adapter Sidecar metrics with assertions for the expected metrics. The PR introduces the lora_syncer_adapter_status metric, but we currently lack e2e tests that validate the full metrics scraping workflow in a real deployment scenario.
The Dynamic LoRA Adapter Sidecar currently exposes the following metrics:
- lora_syncer_adapter_status: Status of LoRA adapters (1=loaded, 0=not_loaded) with adapter_name label
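For reference, a rough sketch of how a test could read this metric, assuming the sidecar serves the standard Prometheus text exposition format. The adapter names, the HELP text, and the gauge type in the sample are illustrative assumptions, not output captured from the sidecar.

```python
import re

# Matches one sample line of the metric, for example:
#   lora_syncer_adapter_status{adapter_name="sql-lora"} 1.0
_SAMPLE_RE = re.compile(
    r'^lora_syncer_adapter_status\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
_ADAPTER_RE = re.compile(r'adapter_name="(?P<name>[^"]*)"')


def parse_adapter_status(exposition_text: str) -> dict:
    """Return {adapter_name: status} for each lora_syncer_adapter_status sample."""
    statuses = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        sample = _SAMPLE_RE.match(line)
        if sample is None:
            continue  # a different metric
        adapter = _ADAPTER_RE.search(sample.group("labels"))
        if adapter is None:
            continue  # sample without an adapter_name label
        # Values are exposed as floats ("1.0"); the test only needs 0 or 1.
        statuses[adapter.group("name")] = int(float(sample.group("value")))
    return statuses


# Illustrative payload only (hypothetical adapter names).
SAMPLE = """\
# HELP lora_syncer_adapter_status Status of LoRA adapters (1=loaded, 0=not_loaded)
# TYPE lora_syncer_adapter_status gauge
lora_syncer_adapter_status{adapter_name="sql-lora"} 1.0
lora_syncer_adapter_status{adapter_name="tweet-summary"} 0.0
"""

if __name__ == "__main__":
    print(parse_adapter_status(SAMPLE))
```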
Implementation Details
The e2e test should:
- Deploy a vLLM model server with the Dynamic LoRA Adapter Sidecar container
- Configure a ConfigMap to specify which LoRA adapters to load/unload
- Trigger adapter loading/unloading operations
- Scrape the metrics endpoint at port 8080
- Assert that the lora_syncer_adapter_status metric is present with appropriate values
- Verify metrics reflect the correct adapter status (loaded/not loaded)
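The scrape-and-assert steps above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names are hypothetical, the `/metrics` path in the usage note is a guess (the issue only specifies port 8080), and the line matching assumes `adapter_name` is the only label on the metric. Polling is used because the sidecar may take some time to reconcile the ConfigMap.

```python
import time
import urllib.request
from typing import Callable, Optional


def scrape_metrics(url: str, timeout_s: float = 5.0) -> str:
    """Fetch the raw metrics text from the sidecar endpoint."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")


def adapter_status(text: str, adapter_name: str) -> Optional[int]:
    """Return 0 or 1 for the adapter, or None if the sample is absent.

    Assumes adapter_name is the only label on the metric.
    """
    prefix = f'lora_syncer_adapter_status{{adapter_name="{adapter_name}"}}'
    for line in text.splitlines():
        if line.startswith(prefix):
            fields = line[len(prefix):].split()
            if fields:
                return int(float(fields[0]))
    return None


def wait_for_status(
    fetch: Callable[[], str],
    adapter_name: str,
    expected: int,
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
) -> bool:
    """Poll fetch() until the adapter reports the expected status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        if adapter_status(fetch(), adapter_name) == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

In a test, after updating the ConfigMap, this might be called as `assert wait_for_status(lambda: scrape_metrics("http://localhost:8080/metrics"), "sql-lora", 1)`, where the URL (for example via a port-forward) and the adapter name are placeholders.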
Current Limitations
As mentioned in the PR discussion, the vLLM simulator currently does not support the following endpoints needed for complete e2e testing:
- /v1/load_lora_adapter
- /v1/unload_lora_adapter
This issue can be implemented once the vLLM simulator is enhanced to support these endpoints, or by using a real vLLM deployment in the e2e test environment.
[UPDATED] The vLLM simulator now supports these endpoints, so this limitation no longer applies to the e2e test.
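To check that the model server (simulator or real vLLM) accepts the two endpoints, a test could build requests along these lines. This is a sketch under assumptions: the `lora_name` and `lora_path` body fields follow vLLM's dynamic LoRA serving API as commonly documented, and the simulator is assumed to accept the same shape. The helper name, base URL, and adapter path are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def build_lora_request(
    base_url: str,
    action: str,
    lora_name: str,
    lora_path: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request for /v1/load_lora_adapter or /v1/unload_lora_adapter."""
    if action not in ("load", "unload"):
        raise ValueError(f"unsupported action: {action!r}")
    body = {"lora_name": lora_name}
    if action == "load":
        # Assumption: loading needs a path (or identifier) for the adapter weights.
        if lora_path is None:
            raise ValueError("lora_path is required when loading an adapter")
        body["lora_path"] = lora_path
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/{action}_lora_adapter",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent here, the request is only constructed.
    req = build_lora_request(
        "http://localhost:8000", "load", "sql-lora", "/adapters/sql-lora"
    )
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(req)` and then polling the sidecar metrics would exercise the full path, although in the sidecar flow the load and unload calls are normally triggered by ConfigMap changes rather than by the test itself.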
Related Issues
Resolves discussion in PR #980: #980 (comment)
Depends on: vLLM simulator enhancement to support LoRA adapter endpoints (now available, per the update above)