Allow Apache Kafka scaler to scale using sum of lag for all topics within a consumer group #2409
/run-e2e kafka.test* |
I'm not an expert in Kafka but LGTM.
Could you add a unit test checking the new metric name please?
You can add the test case here
hi @PaulLiang1 |
68650ca to d745707
/run-e2e kafka.test* |
Hi @JorTurFer , thanks for your help. |
Thanks @PaulLiang1 , |
Hi @JorTurFer, it turns out I previously had some misunderstanding for |
/run-e2e kafka.test* |
Hi @JorTurFer, the tests passed. Would you mind taking another look at the PR? Thanks |
LGTM!
Only a little suggestion
/run-e2e kafka.test* |
LGTM!
Thanks for this contribution ❤️
(in any case, let's wait until another pair of eyes takes a look at this)
Any particular reason you chose to implement this, rather than setting up different kafka triggers for each topic individually, all within the same scaledObject? I have no idea what your applications do, but I'd imagine the different kafka topics to be producing different messages that get consumed in very different ways. Is it really valuable to be having a mixed measure that doesn't discriminate based on what the messages really are? I do have consumers that consume from multiple topics, and I've had good success with tracking each topic separately with different thresholds set to account for different volumes/computational intensity to process the different message types. |
Thanks for the discussion. Context:
|
I'll admit I never considered the possibility of somebody consuming from 480 topics at once :) It would definitely be good to emphasize in the docs the autodiscovery behaviour that results from leaving the topic string empty. I've had problems before in almost the inverse situation, where I was accidentally scaling based on the total lag of all consumer groups on a topic.

One limitation I can think of is that autodiscovery is limited to a single Kafka cluster: all topics you discover must be present within the same cluster. You could, of course, supply multiple Kafka scaler triggers, each autodiscovering a different cluster, if necessary. Just something else to maybe highlight in the docs.

I suppose the other thing to be aware of is that with 480 topics, any significant number of partitions per topic quickly adds up to a very large total number of partitions that must be queried. There is currently another active PR to make broker queries concurrent. By the time this PR is included in the next public release, concurrency should alleviate any performance concerns there. The PR seems sane to me. |
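For readers skimming the thread, the autodiscovery behaviour and the partition-count concern raised in this comment can be sketched roughly as follows. This is a minimal illustration in Go, not the scaler's actual code: `TopicPartitions`, `SelectTopics`, and `CountPartitions` are hypothetical names, and the sketch assumes that discovery works from the topics a consumer group has committed offsets for within a single cluster.

```go
package main

import (
	"fmt"
	"sort"
)

// TopicPartitions maps a topic name to the partition IDs on which the
// consumer group has committed offsets (hypothetical type for illustration).
type TopicPartitions map[string][]int32

// SelectTopics returns the topics whose lag should be measured.
// A non-empty configured topic restricts the result to that topic;
// an empty one falls back to every topic known for the group.
func SelectTopics(configured string, committed TopicPartitions) []string {
	if configured != "" {
		return []string{configured}
	}
	topics := make([]string, 0, len(committed))
	for topic := range committed {
		topics = append(topics, topic)
	}
	sort.Strings(topics) // deterministic order for logging and tests
	return topics
}

// CountPartitions returns how many partitions must be queried in total
// for the selected topics.
func CountPartitions(topics []string, committed TopicPartitions) int {
	total := 0
	for _, topic := range topics {
		total += len(committed[topic])
	}
	return total
}

func main() {
	// Made-up topic names and partition layout.
	committed := TopicPartitions{
		"orders":   {0, 1, 2},
		"payments": {0, 1},
	}
	topics := SelectTopics("", committed)
	fmt.Println(topics, CountPartitions(topics, committed))
}
```

With hundreds of topics, the value returned by `CountPartitions` is the number of partition offsets that would have to be fetched on each polling cycle, which is where concurrent broker queries help.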
PR for doc: kedacore/keda-docs#613
In the example provided above, topics from other geographic locations are mirrored into a single aggregation cluster, and the consumer consumes only from that cluster.
Correct. This change does not forbid that. |
@PaulLiang1 do you think you can add a few words about this to the docs, so users are aware of the consequences? |
@PaulLiang1 Could you please rebase your PR? There are conflicts because of #2409
Thanks!
Sure, sorry, I'm just back from holidays. I will work on it in the next few days |
Signed-off-by: Jinli Liang <paul.liang@rokt.com>
36a3ae4 to f8b6735
Hi @zroubalik
|
/run-e2e kafka.test* |
LGTM
Once the Doc PR is fixed, we can merge this one. Great job @PaulLiang1
@PaulLiang1 Thank you for this! What would be the maximum replicas? Is it the sum of partitions of all the topics in a consumer group? For example, a consumer group has Topic 1 with 10 partitions and Topic 2 with 6 partitions. Will it scale up to 16 pods or 10 pods? Edit: Oops, just found the answer to my question as soon as I posted this! |
ref: |
Allow kafka scaler to use sum of lag for all topic partitions when no topic is supplied. This is useful when the consumer is subscribed to multiple topics;
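The "sum of lag for all topic partitions" idea can be illustrated with a small Go sketch. It is not taken from the PR's diff: `TopicPartition` and `TotalLag` are hypothetical names, and the sketch makes two simplifying assumptions of its own, namely that partitions without a committed offset are skipped and that negative differences are clamped to zero.

```go
package main

import "fmt"

// TopicPartition identifies one partition of one topic
// (hypothetical type for illustration).
type TopicPartition struct {
	Topic     string
	Partition int32
}

// TotalLag sums (latest offset - committed offset) over every partition
// for which the consumer group has a committed offset.
func TotalLag(latest, committed map[TopicPartition]int64) int64 {
	var total int64
	for tp, end := range latest {
		offset, ok := committed[tp]
		if !ok {
			// Simplification: ignore partitions the group has never committed on.
			continue
		}
		if lag := end - offset; lag > 0 {
			total += lag
		}
	}
	return total
}

func main() {
	// Made-up offsets for two topics consumed by the same group.
	latest := map[TopicPartition]int64{
		{Topic: "orders", Partition: 0}:   100,
		{Topic: "orders", Partition: 1}:   50,
		{Topic: "payments", Partition: 0}: 30,
	}
	committed := map[TopicPartition]int64{
		{Topic: "orders", Partition: 0}:   90,
		{Topic: "orders", Partition: 1}:   50,
		{Topic: "payments", Partition: 0}: 5,
	}
	fmt.Println(TotalLag(latest, committed)) // 10 + 0 + 25 = 35
}
```

A single total like this trades per-topic thresholds for simplicity, which is the trade-off discussed earlier in the conversation.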
Checklist