-
-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(slices): Make consumers "slice-aware" #3259
Conversation
Codecov ReportBase: 92.93% // Head: 21.81% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #3259 +/- ##
===========================================
- Coverage 92.93% 21.81% -71.12%
===========================================
Files 702 663 -39
Lines 32256 31249 -1007
===========================================
- Hits 29976 6817 -23159
- Misses 2280 24432 +22152
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Your change is currently not selecting the correct ClickHouse cluster for the given slice, so that needs to be done before a merge is possible (you can't rely on the changes we made to the query path). Given that the snuba/snuba/datasets/table_storage.py Lines 243 to 250 in efd9d42
|
e7ee9ab
to
311b3f8
Compare
In order to facilitate cluster selection using slice id, I modified One alternative to this would have been to modify the I'll be making more organizational changes to the code next (e.g. putting the logical to physical topic mapping into a function, etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pending my comments, I think I'm fine with this approach.
Have you tried testing locally by creating a sliced cluster/topic, manually starting a consumer, and seeing if you can write to slice 1?
A few updates to this PR: Tested consumer behavior using the Removed the logic for mapping logical Snuba topics to physical/sliced topics. Snuba topics are only used for retrieving Kafka broker config in the Consumer builder, and broker config can be found using simply a (logical topic name, slice id) pair. Kept the logic for mapping logical Arroyo topics to physical/sliced topics, and passed this sliced topic into StreamProcessor logic so that the consumer connects to this correct topic (this is Finally, made a modification which may be a small typo in the Topic enum to class PR (since this PR builds on that PR). Will update here as necessary. Next steps include addressing other comments above regarding KafkaTopicSpec, etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we're going in the right direction to me. Thanks for doing more in-depth testing! It might be annoying but it's definitely worse to find errors post-deploy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think this is looking better. My main comments are around consistency. We need to make the slices behave the same as slice 0, currently there are too many subtle differences.
- Passing slice=0 should work, and it should behave the same as slice=None, or not passing the slice ID. Currently it behaves in pretty unexpected ways and you will probably get something broken if you pass slice=0.
- We need to support the topic overrides and the same features in general for slices. How come you can't override a topic if you are using a slice? I think that code for various slices should behave the same, these small differences are very unexpected
- Validation should also be done in the shared KafkaTopicSpec so it will be automatically run every time we fetch the slice data not implemented and called independently in every separate entrypoint, of which there will be many.
snuba/consumers/consumer_builder.py
Outdated
@@ -109,29 +129,42 @@ def __init__( | |||
|
|||
stream_loader = self.storage.get_table_writer().get_stream_loader() | |||
|
|||
self.raw_topic: Topic | |||
self.raw_topic: ArroyoTopic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personal preference, I'd rather you just left all these names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with leaving the names is the potential confusion between (Snuba) Topic
and an (Arroyo) Topic
. If both are Topic
(how it was originally), it can become confusing to work with the code in future iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue is the other topic (SnubaTopic) is being imported here but that should never be the case. It should always be accessed via the stream loader and KafkaTopicSpec mechanism to ensure that topic resolving is done correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. I no longer have those imports. Shall I change ArroyoTopic
back to Topic
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That seems good to me. It feels a little bit more consistent with what the other files do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bumping this. Can you please revert the change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR:
snuba consumer
ConsumerBuilder
logic to take into account any slice id that is passed inregister_topic()
to register slice-specific physical topics that are defined inSLICED_KAFKA_TOPIC_MAP
in settingsNext steps: