Closed
Description
What would you like to be added
Add end-to-end (e2e) tests to verify that EPP metrics can be properly scraped from the metrics endpoint at port 9090.
Why is this needed
As discussed in PR #980, we need comprehensive e2e tests that include test cases for scraping EPP metrics with assertions for the expected metrics. While we have integration tests that verify metrics functionality, we currently lack e2e tests that validate the full metrics scraping workflow in a real deployment scenario.
The EPP currently exposes the following metrics:
- inference_model_request_total
- inference_model_request_error_total
- inference_model_request_duration_seconds
- normalized_time_per_output_token_seconds
- inference_model_request_sizes
- inference_model_response_sizes
- inference_model_input_tokens
- inference_model_output_tokens
- inference_model_running_requests
- inference_pool_average_kv_cache_utilization
- inference_pool_average_queue_size
- inference_pool_per_pod_queue_size
- inference_pool_ready_pods
- inference_extension_info
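One way to turn this list into an assertion is sketched below. It assumes the endpoint serves the standard Prometheus text exposition format and that the metrics appear under exactly the names listed above, with no additional prefix. The helper names `metric_families` and `missing_metrics` are hypothetical and not part of the project.

```python
import re

# Metric names copied verbatim from the list in this issue.
EXPECTED_METRICS = [
    "inference_model_request_total",
    "inference_model_request_error_total",
    "inference_model_request_duration_seconds",
    "normalized_time_per_output_token_seconds",
    "inference_model_request_sizes",
    "inference_model_response_sizes",
    "inference_model_input_tokens",
    "inference_model_output_tokens",
    "inference_model_running_requests",
    "inference_pool_average_kv_cache_utilization",
    "inference_pool_average_queue_size",
    "inference_pool_per_pod_queue_size",
    "inference_pool_ready_pods",
    "inference_extension_info",
]

_SAMPLE_NAME_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")
# Histogram and summary samples carry these suffixes on top of the family name.
_SUFFIXES = ("_bucket", "_sum", "_count")


def metric_families(exposition_text):
    """Return the set of metric family names found in Prometheus text output."""
    families = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # "# TYPE <name> <type>" and "# HELP <name> <text>" name the family.
            if len(parts) >= 3 and parts[1] in ("TYPE", "HELP"):
                families.add(parts[2])
            continue
        match = _SAMPLE_NAME_RE.match(line)
        if match:
            name = match.group(1)
            families.add(name)
            for suffix in _SUFFIXES:
                if name.endswith(suffix):
                    families.add(name[: -len(suffix)])
    return families


def missing_metrics(exposition_text, expected=EXPECTED_METRICS):
    """Return the expected metric names that do not appear in the scrape."""
    found = metric_families(exposition_text)
    return [name for name in expected if name not in found]
```

An e2e test could then assert that `missing_metrics(scraped_text)` is empty. Metrics that only appear after the first observation (for example, labeled counters) may be absent until traffic has been sent, so the check is best run after the request step.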
Implementation Details
The e2e test should:
- Deploy an EPP configuration
- Send inference requests to generate metrics
- Scrape the metrics endpoint at port 9090
- Assert that expected metrics are present with appropriate values
- Verify metrics behavior under different scenarios (successful requests, error conditions, etc.)
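The scrape-and-assert steps above could look roughly like the following sketch. It is an illustration under stated assumptions, not the project's test code: the URL assumes port 9090 has been port-forwarded to localhost and that metrics are served at `/metrics` without authentication, and `scrape`, `sum_samples`, `assert_counter_increased`, and `run_check` are hypothetical helper names. Deploying the EPP and sending inference requests are left to a caller-supplied `send_requests` function.

```python
import urllib.request

# Assumption: the EPP metrics port is port-forwarded to localhost:9090
# and served at the conventional /metrics path.
METRICS_URL = "http://localhost:9090/metrics"


def scrape(url=METRICS_URL, timeout=5.0):
    """Fetch the raw Prometheus text exposition from the metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def sum_samples(exposition_text, metric_name):
    """Sum the values of all samples named metric_name across label sets."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The sample name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name != metric_name:
            continue
        # The value is the first token after the label set, if any.
        rest = line.rsplit("}", 1)[-1] if "{" in line else line[len(name):]
        total += float(rest.split()[0])
    return total


def assert_counter_increased(before_text, after_text, metric_name, at_least):
    """Assert that a counter grew by at least `at_least` between two scrapes."""
    delta = sum_samples(after_text, metric_name) - sum_samples(before_text, metric_name)
    assert delta >= at_least, (
        f"{metric_name} increased by {delta}, expected at least {at_least}"
    )
    return delta


def run_check(send_requests, fetch=scrape, n=5):
    """Scrape, send n requests, scrape again, and compare the request counter."""
    before = fetch()
    send_requests(n)
    after = fetch()
    return assert_counter_increased(before, after, "inference_model_request_total", n)
```

Comparing two scrapes, instead of asserting absolute values, keeps the check independent of traffic from earlier tests. The same pattern could cover the error scenario by passing a `send_requests` that issues failing requests and asserting on `inference_model_request_error_total`.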
Related Issues
Resolves the discussion in PR #980 (see the linked comment).