-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-13569][STREAMING][KAFKA] pattern based topic subscription #14026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@tdas @zsxwing This should be the last ConsumerStrategy implementation to have basic parity with what's offered by the kafka consumer, anything else should probably be handled by user subclasses. If the KAFKA-3370 workaround stuff isn't clear... the basic issue is that you have to poll in order to get partition assignments before setting a position... but if you poll with auto offset none, it will throw an exception because you don't have a position yet :) |
Test build #61649 has finished for PR 14026 at commit
|
// silence exception | ||
} | ||
toSeek.asScala.foreach { case (topicPartition, offset) => | ||
consumer.seek(topicPartition, offset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 chars for indent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Foreach is a scope, case is a nested scope.
currentOffsets | ||
} | ||
if (!toSeek.isEmpty) { | ||
// work around KAFKA-3370 when reset is none |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you comment to explain the problem in short and the work around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will add comment in the code once I'm back at my workstation.
Test build #61889 has finished for PR 14026 at commit
|
Test build #61892 has finished for PR 14026 at commit
|
I am going to merge this PR in the interest of RC2. But I would like to have more tests for testing the conditions that led to the poll+seek to be added. These subtle behaviors should not go untested. Would be very good if you open another PR to added more tests. |
## What changes were proposed in this pull request? Allow for kafka topic subscriptions based on a regex pattern. ## How was this patch tested? Unit tests, manual tests Author: cody koeninger <cody@koeninger.org> Closes #14026 from koeninger/SPARK-13569. (cherry picked from commit fd6e8f0) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
What changes were proposed in this pull request?
Allow for kafka topic subscriptions based on a regex pattern.
How was this patch tested?
Unit tests, manual tests