Not able to achieve streaming of Keda scaled jobs #5881
Comments
Any updates on this issue?
Makes sense, are you willing to contribute a fix?
This feature can probably resolve the issue.
@junekhan Should we set any specific parameter in scaledjob spec to resolve this issue?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity.
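In case it helps others who land here: the ScaledJob spec exposes a `scalingStrategy` field, and the `accurate` strategy is the setting most often suggested for this symptom, since it sizes the number of new jobs from the queue length rather than subtracting already-running jobs. Whether it resolves this particular report is not confirmed in the thread; the sketch below is illustrative and the name is a placeholder.

```yaml
# Hedged sketch: scalingStrategy is part of the ScaledJob spec,
# but its effect on this specific report is unverified.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: example-scaledjob     # placeholder name
spec:
  scalingStrategy:
    strategy: "accurate"      # alternatives: "default", "custom"
```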
We are running Generative AI workloads (GPU resources) using KEDA scaled jobs. We are not able to achieve streaming of KEDA scaled jobs, i.e. having new jobs start while existing jobs are still in progress.
Expected Behavior
Scenarios:
Keda scaled job settings --> pollingInterval = 30, maxReplicaCount = 10, parallelism = 1
(Assuming the SQS queue is empty before placing the messages in the queue for the below scenarios)
1 message in the queue → KEDA triggers 1 job/pod and processes it. If another message is placed in the queue while the 1st job is still running, KEDA will not trigger another job until the 1st job completes. We would expect KEDA to process subsequent messages even while existing jobs are in progress.
FYI: we did achieve streaming of batch jobs on AWS SageMaker, where we can create N jobs in parallel even while existing jobs are in progress.
Actual Behavior
Scenarios:
(Assuming the SQS queue is empty before placing the messages in the queue for the below scenarios)
1 message in the queue → KEDA triggers 1 job/pod and processes it. If another message is placed in the queue while the 1st job is still running, KEDA does not trigger another job until the 1st job completes.
We tried to address this by increasing parallelism to 5 (see Steps to Reproduce).
(Assuming the SQS queue is empty before placing the messages in the queue for the below scenarios)
1 message in the queue → KEDA triggers 5 jobs, but only 1 job processes a message; the other 4 jobs/pods sit idle. This is expensive, because we launch 4 GPU pods unnecessarily when there is only one message in the queue.
2 messages in the queue → KEDA triggers 5 jobs and processes 2 messages, but the other 3 pods sit idle.
Steps to Reproduce the Problem
Keda scaled job settings --> pollingInterval = 30, maxReplicaCount = 10, parallelism = 1
(Assuming the SQS queue is empty before placing the messages in the queue for the below scenarios)
1 message in the queue → Keda triggers 1 job/pod and processes it.
Keda scaled job settings --> pollingInterval = 30, maxReplicaCount = 10, parallelism = 5
(Assuming the SQS queue is empty before placing the messages in the queue for the below scenarios)
1 message in the queue → KEDA triggers 5 jobs, but only 1 processes a message.
2 messages in the queue → KEDA triggers 5 pods, but only 2 process messages.
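For reference, the settings above correspond to a ScaledJob roughly like the following. This is a sketch, not the reporter's actual manifest: the name, container image, and queue URL are placeholders.

```yaml
# Sketch of the reported configuration; identifiers are placeholders.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: genai-worker            # hypothetical name
spec:
  pollingInterval: 30
  maxReplicaCount: 10
  jobTargetRef:
    parallelism: 1              # 5 in the second scenario
    template:
      spec:
        containers:
          - name: worker
            image: example/genai-worker:latest   # placeholder image
        restartPolicy: Never
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/example-queue  # placeholder
        queueLength: "1"
        awsRegion: us-east-1
```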
Logs from KEDA operator
No response
KEDA Version
2.14.0
Kubernetes Version
1.29
Platform
Amazon Web Services
Scaler Details
AWS SQS Queue
Anything else?
No response