@LikiosSedo LikiosSedo commented Oct 27, 2025

Dynamic Weight Calculation for HTTPRoute BackendRefs

Resolves: #60


Summary

Implement dynamic weight calculation for HTTPRoute backendRefs based on Application ReadyReplicas to solve multi-tenant load balancing issues.

Problem

HTTPRoute backend weights are statically configured, causing unfair traffic distribution:

  • Applications with many replicas are underutilized
  • Applications with few replicas are overloaded
  • Poor resource efficiency in multi-tenant scenarios

Example (three tenants with equal static weights):

Tenant A: 10 replicas → receives 33% traffic (should be 62.5%)
Tenant B: 5 replicas  → receives 33% traffic (should be 31.25%)
Tenant C: 1 replica   → receives 33% traffic (should be 6.25%)
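
The target percentages above follow from dividing each tenant's replica count by the total (16). A minimal sketch of that arithmetic, assuming weights are set equal to ready replica counts; the helper name `trafficShares` is illustrative and not part of this PR:

```go
package main

import "fmt"

// trafficShares returns each backend's expected share of traffic, in percent,
// assuming the route weight of every backend equals its ready replica count.
// This is an illustrative helper, not code from the controller.
func trafficShares(readyReplicas []int32) []float64 {
	var total int32
	for _, r := range readyReplicas {
		total += r
	}
	shares := make([]float64, len(readyReplicas))
	if total == 0 {
		// No ready replicas anywhere: every share stays at zero.
		return shares
	}
	for i, r := range readyReplicas {
		shares[i] = float64(r) / float64(total) * 100
	}
	return shares
}

func main() {
	// Tenants A, B, C from the example above.
	fmt.Println(trafficShares([]int32{10, 5, 1})) // [62.5 31.25 6.25]
}
```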

Solution

Core Change: Calculate weight dynamically based on ReadyReplicas

Benefits:

  1. Traffic distributed proportionally to application capacity
  2. Automatic adjustment during scaling operations
  3. Support for progressive deployments
  4. Better resource utilization in multi-tenant environments

Implementation

ArksApplication

Before:

if app.Spec.Replicas != int(app.Status.ReadyReplicas) {
    continue  // Skip if not fully Ready
}
Weight: &ep.Spec.DefaultWeight  // Static weight

After:

weight := int32(app.Status.ReadyReplicas)
if weight == 0 {
    continue  // Skip only if zero Ready
}
Weight: &weight  // Dynamic weight
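
The fragment above can be read as the loop sketched below. This is a self-contained approximation only: the `app` and `backendRef` types are hypothetical stand-ins for ArksApplication and the HTTPRoute backend reference, and `buildBackendRefs` is not a function from this PR.

```go
package main

import "fmt"

// app is a hypothetical stand-in for ArksApplication; only the fields
// needed for the weight calculation are modeled.
type app struct {
	Name          string
	ReadyReplicas int32
}

// backendRef is a hypothetical stand-in for an HTTPRoute backend reference.
type backendRef struct {
	Name   string
	Weight *int32
}

// buildBackendRefs includes every application with at least one ready
// replica and uses the ready replica count as its weight.
func buildBackendRefs(apps []app) []backendRef {
	refs := make([]backendRef, 0, len(apps))
	for _, a := range apps {
		// weight is declared inside the loop body, so each iteration gets
		// its own variable and taking its address is safe.
		weight := a.ReadyReplicas
		if weight <= 0 {
			continue // skip only when nothing is ready
		}
		refs = append(refs, backendRef{Name: a.Name, Weight: &weight})
	}
	return refs
}

func main() {
	refs := buildBackendRefs([]app{
		{Name: "tenant-a", ReadyReplicas: 10},
		{Name: "tenant-b", ReadyReplicas: 0},
		{Name: "tenant-c", ReadyReplicas: 1},
	})
	for _, r := range refs {
		fmt.Printf("%s weight=%d\n", r.Name, *r.Weight)
	}
}
```

Declaring `weight` inside the loop matters because the reference stores a pointer: a variable shared across iterations would leave every backend pointing at the last value.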

ArksDisaggregatedApplication

Before:

// Require Prefill/Decode fully Ready
if app.Status.Router.ReadyReplicas > 0 &&
    app.Status.Prefill.ReadyReplicas == app.Status.Prefill.Replicas &&
    app.Status.Decode.ReadyReplicas == app.Status.Decode.Replicas {
    Weight: &ep.Spec.DefaultWeight
}

After:

// Use Router.ReadyReplicas as weight
weight := int32(app.Status.Router.ReadyReplicas)
if weight == 0 ||
   app.Status.Prefill.ReadyReplicas == 0 ||
   app.Status.Decode.ReadyReplicas == 0 {
    continue
}
Weight: &weight
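
The inclusion rule for disaggregated applications can be isolated as a small predicate. The sketch below assumes a trimmed-down status struct; `disaggStatus` and `routeWeight` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// disaggStatus is a hypothetical, trimmed-down view of the status fields
// referenced above (Router, Prefill, and Decode ready replica counts).
type disaggStatus struct {
	RouterReady  int32
	PrefillReady int32
	DecodeReady  int32
}

// routeWeight returns the weight for a disaggregated application and
// whether it should be included in the HTTPRoute at all. The backend is
// excluded when any of the three components has no ready replica.
func routeWeight(s disaggStatus) (int32, bool) {
	if s.RouterReady <= 0 || s.PrefillReady <= 0 || s.DecodeReady <= 0 {
		return 0, false
	}
	return s.RouterReady, true
}

func main() {
	w, ok := routeWeight(disaggStatus{RouterReady: 2, PrefillReady: 1, DecodeReady: 3})
	fmt.Println(w, ok) // 2 true

	w, ok = routeWeight(disaggStatus{RouterReady: 2, PrefillReady: 0, DecodeReady: 3})
	fmt.Println(w, ok) // 0 false
}
```

Note that under this rule the weight tracks router capacity only; the Prefill and Decode counts act as a gate and do not scale the weight.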

Key Changes:

  • ArksApplication: weight = ReadyReplicas
  • ArksDisaggregatedApplication: weight = Router.ReadyReplicas
  • Allow partial Ready state during scaling
  • Enhanced logging with weight information

Testing

Tested in a multi-tenant environment:

Test Scenario | Result
Multi-tenant load balancing (1, 10, 5 replicas) | PASS: Weights correctly match ReadyReplicas
Progressive deployment (scaling 1→10) | PASS: Weight tracks ReadyReplicas during scaling
Edge cases (scale to zero, recovery) | PASS: Correctly excluded/re-added
Performance and stability | PASS: No regressions, stable controller

Verification:

  • HTTPRoute weights correctly reflect ReadyReplicas
  • Traffic distributed proportionally to capacity
  • Controller remains stable (no restarts, normal resource usage)
  • No performance regression

Compatibility

Backward Compatibility: Fully compatible

  • No API or CRD changes
  • Existing deployments benefit automatically after upgrade
  • No migration required

Breaking Changes: None

Rollback: Safe rollback available if needed


Checklist

  • Changes clearly explained
  • Tests pass successfully
  • Code adheres to project style
  • No regressions introduced
  • Deployment verified in a production-like environment

…ackendRefs

  - Calculate weight dynamically using Application.Status.ReadyReplicas
  - ArksApplication: weight = ReadyReplicas (instead of DefaultWeight)
  - ArksDisaggregatedApplication: weight = Router.ReadyReplicas
  - Support progressive deployment (allow partial Ready state)
  - Enhanced logging with weight/replica information

  Resolves: scitix#60
@LikiosSedo LikiosSedo force-pushed the feature/dynamic-weight branch from ef2af01 to 47001b6 Compare October 27, 2025 09:33