cron-scaler scales higher than expected #5820

Closed

dagvl opened this issue May 22, 2024 · 4 comments

Labels
bug Something isn't working
stale All issues that are marked as stale due to inactivity

Comments


dagvl commented May 22, 2024

Report

When using cron triggers together with a scaleDown policy of 5% of pods every 2 seconds, the deployment never scales down to the expected number of pods.

E.g., if I have a cron trigger requesting 20 pods and then edit that trigger to request 10 pods, the deployment only scales down to 11 pods instead of 10.

This is related to the scaleDown policy, because if I set a policy of 100% of pods every 2 seconds, it correctly scales down to 10 pods.

Expected Behavior

I expect the number of replicas to match the desiredReplicas in the cron trigger.

Actual Behavior

I get more than 10 replicas.

Steps to Reproduce the Problem

First create a ScaledObject referencing a deployment with a cron trigger requesting 20 pods:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaled-object-test   # name assumed here so the manifest is applyable; the report only showed the spec
spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          policies:
          - periodSeconds: 2
            type: Percent
            value: 5
          - periodSeconds: 2
            type: Pods
            value: 1
          stabilizationWindowSeconds: 2
      name: scaled-object-test-hpa
    scalingModifiers: {}
  cooldownPeriod: 2
  fallback:
    failureThreshold: 3
    replicas: 1
  maxReplicaCount: 200
  minReplicaCount: 10
  pollingInterval: 30
  scaleTargetRef:
    name: scaled-object-test
  triggers:
  - metadata:
      value: "80"
    metricType: Utilization
    type: cpu
  - metadata:
      desiredReplicas: "20"
      end: 59 23 * * 6
      start: 0  0  * * 0
      timezone: Europe/Oslo
    type: cron

(Note that this also has a CPU utilization trigger, only because of the internal tooling we use to generate the ScaledObject. That trigger is not a factor here, as average CPU usage is 0% in my pods [it's an idle nginx container].)

Note that the scaleDown setting allows removing max(1, pods * 0.05) pods per period.
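
For example, even at 11 replicas the policies still permit a further step down (assuming the default selectPolicy: Max, i.e. whichever policy allows the larger change wins):

  Percent policy: 11 * 0.05 = 0.55 pods per 2 s
  Pods policy:    1 pod per 2 s
  -> at least 1 pod per period may still be removed, so the scaleDown policies themselves do not explain stopping at 11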

Apply this ScaledObject and see that the deployment scales up to 20 replicas.

Then change desiredReplicas to 10 and reapply (see the fragment below).
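
Only the cron trigger entry changes; the rest of the spec stays exactly as above:

  - metadata:
      desiredReplicas: "10"
      end: 59 23 * * 6
      start: 0  0  * * 0
      timezone: Europe/Oslo
    type: cron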

The deployment starts to scale down slowly, but the scale-down stops at 11 replicas instead of 10.

If you set the scaleDown policy to 100% and do the same thing, the scale-down ends at 10 pods as expected (see the variant below).
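
A sketch of that variant, changing only the behavior block of the spec above:

      behavior:
        scaleDown:
          policies:
          - periodSeconds: 2
            type: Percent
            value: 100
          stabilizationWindowSeconds: 2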

Logs from KEDA operator

No response

KEDA Version

2.14.0

Kubernetes Version

1.29

Platform

Amazon Web Services

Scaler Details

cron

Anything else?

One thought that crossed my mind, though I can't verify it, is that the HPA is scaling to within a tolerance of the target rather than to the exact value.

E.g., right now I have the cron desiredReplicas set to 10, but the deployment is stuck at 11.

If I look at the HPA, I see this:

  "s1-cron-Europe-Oslo-00xx0-5923xx6" (target average value):  910m / 1

10/11 ≈ 0.91, which matches the 910m value above and suggests the cron scaler is emitting the correct metric but the HPA is not reacting to it. The production case is similar:
the ScaledObject cron trigger is requesting 220 desiredReplicas, but we currently have 244 replicas. Looking at the HPA we have:

  "s1-cron-Australia-Sydney-01xx1-010xx4" (target average value):  902m / 1

220/244 ≈ 0.902, so again we are within 10% of the target value.
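
If I understand the HPA algorithm correctly, this is the built-in rescale tolerance: the kube-controller-manager flag --horizontal-pod-autoscaler-tolerance defaults to 0.1, and the HPA skips rescaling while the current/target metric ratio stays within that band of 1.0:

  desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
  11 replicas:  |1 - 0.909| ≈ 0.091 < 0.1  -> no rescale, stuck at 11
  244 replicas: |1 - 0.902| ≈ 0.098 < 0.1  -> no rescale, stuck at 244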

dagvl added the bug label May 22, 2024
JorTurFer (Member) commented May 26, 2024

Hello,
You're right, the problem here is the 10% tolerance, and currently there isn't any solution :(
I don't know if @SpiritZhou will eventually contribute this feature upstream. Do you have any extra info, @SpiritZhou?

SpiritZhou (Contributor) commented:

I am still working on it.


stale bot commented Jul 26, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Jul 26, 2024

stale bot commented Aug 3, 2024

This issue has been automatically closed due to inactivity.

stale bot closed this as completed Aug 3, 2024