@abrarsheikh abrarsheikh commented Oct 16, 2025

flaky test

RAY_SERVE_HANDLE_AUTOSCALING_METRIC_PUSH_INTERVAL_S=0.1 \
RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER=1 \
RAY_SERVE_COLLECT_AUTOSCALING_METRICS_ON_HANDLE=0 \
pytest -svvx "python/ray/serve/tests/test_autoscaling_policy.py::TestAutoscalingMetrics::test_basic[min]"

What I think is the likely cause

When using RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER=1 with min aggregation:

  1. Replicas emit metrics at slightly different times (even if just 10ms apart due to the timestamp bucketing/rounding)

  2. The merged timeseries reflects the ramp-up:

    • At t=0: Maybe only replica 1 is reporting → total = 25 requests
    • At t=0.01: Replica 2 starts reporting → total = 40 requests
    • At t=0.02: Replica 3 starts reporting → total = 50 requests
    • etc.
  3. min aggregation captures the starting point:

    • aggregate_timeseries(..., aggregation_function="min") takes the minimum value from the merged timeseries
    • This will always be one of those initial low values (like 25) when only a subset of replicas had started reporting
    • This value can never be ≥ 45, making the test inherently flaky
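
To make the ramp-up effect concrete, here is a minimal sketch using the illustrative numbers from the list above. The `merge_timeseries` and `aggregate` helpers are hypothetical stand-ins written for this example, not Ray Serve's actual merge or `aggregate_timeseries` implementation, and the timestamps are assumed to be integer milliseconds.

```python
def merge_timeseries(per_replica):
    """Sum the latest known value of every replica at each timestamp.

    per_replica maps a replica id to a time-sorted list of
    (timestamp_ms, running_requests) points. Hypothetical helper.
    """
    timestamps = sorted({t for series in per_replica.values() for t, _ in series})
    merged = []
    for t in timestamps:
        total = 0
        for series in per_replica.values():
            # Latest value reported at or before t; a replica that has not
            # reported yet contributes nothing to the total.
            seen = [v for ts, v in series if ts <= t]
            if seen:
                total += seen[-1]
        merged.append((t, total))
    return merged


def aggregate(merged, how):
    """Apply a simple aggregation over the merged values. Hypothetical helper."""
    values = [v for _, v in merged]
    if how == "min":
        return min(values)
    if how == "max":
        return max(values)
    raise ValueError(f"unsupported aggregation: {how}")


# Three replicas that start reporting 10 ms apart (assumed example data).
reports = {
    "replica_1": [(0, 25), (10, 25), (20, 25)],
    "replica_2": [(10, 15), (20, 15)],
    "replica_3": [(20, 10)],
}

merged = merge_timeseries(reports)
print(merged)                    # [(0, 25), (10, 40), (20, 50)]
print(aggregate(merged, "min"))  # 25, taken from the ramp-up at t=0
print(aggregate(merged, "max"))  # 50, the steady-state total
```

With this data, `min` returns the t=0 value where only one replica had reported, which is below the 45 the test expects, while `max` sees the full total.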

Removing min from test fixture.

I think a more robust solution is to keep the last report in the controller, generate the final time series using both reports, then clip the data and mid-point, then apply the aggregation function.
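
One possible reading of that idea, as a rough sketch only: keep each replica's previous report, combine it with the current one so every replica already has a known value when the current window starts, clip the merged series to the current window, and only then aggregate. All names here (`merge_timeseries`, `robust_aggregate`, `window_start_ms`) are hypothetical, and since the comment does not spell out how the mid-point is used, this sketch simply clips at the start of the current window, which is an assumption.

```python
def merge_timeseries(per_replica):
    """Sum the latest known value of every replica at each timestamp.

    Hypothetical helper, not Ray Serve's implementation.
    """
    timestamps = sorted({t for series in per_replica.values() for t, _ in series})
    merged = []
    for t in timestamps:
        total = 0
        for series in per_replica.values():
            seen = [v for ts, v in series if ts <= t]
            if seen:
                total += seen[-1]
        merged.append((t, total))
    return merged


def robust_aggregate(previous, current, window_start_ms, how=min):
    """Merge previous and current reports, clip to the window, then aggregate."""
    combined = {}
    for replica_id in set(previous) | set(current):
        # Carry the last report forward so a late first point in the current
        # report does not look like a replica with zero requests.
        combined[replica_id] = sorted(
            previous.get(replica_id, []) + current.get(replica_id, [])
        )
    merged = merge_timeseries(combined)
    # Assumed clipping rule: keep only points inside the current window.
    clipped = [v for t, v in merged if t >= window_start_ms]
    return how(clipped)


# Previous reports (assumed data): every replica reported at t=90 ms.
previous = {
    "replica_1": [(90, 25)],
    "replica_2": [(90, 15)],
    "replica_3": [(90, 10)],
}
# Current reports arrive staggered by 10 ms, as in the ramp-up above.
current = {
    "replica_1": [(100, 25), (110, 25), (120, 25)],
    "replica_2": [(110, 15), (120, 15)],
    "replica_3": [(120, 10)],
}

print(robust_aggregate(previous, current, window_start_ms=100))  # 50
```

Under these assumptions, `min` over the clipped series is 50 rather than 25, because the staggered start of the current reports no longer shows up as a dip.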

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh requested a review from a team as a code owner October 16, 2025 05:36
@akyang-anyscale

what's the reasoning behind the flakiness?

@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Oct 16, 2025
@ray-gardener ray-gardener bot added the serve Ray Serve Related Issue label Oct 16, 2025
@zcin zcin merged commit 978a9af into master Oct 16, 2025
6 checks passed
@zcin zcin deleted the SERVE-1239-abrar-flaky branch October 16, 2025 17:51