
feat(slices): Make consumers "slice-aware" #3259

Merged — 41 commits merged from sliced-consumers-cli into master on Nov 14, 2022
Conversation

ayirr7 (Member) commented Oct 14, 2022

This PR:

  • Passes in slice id as a CLI argument to snuba consumer
  • When relevant, converts the default logical topic to the physical topic (specific to the slice) using sliced topic configurations in settings. This mapping is performed for the input consumption topic and the commit log topic.
  • Mainly modifies the ConsumerBuilder logic to take into account any slice id that is passed in
  • Builds upon the changes in the "Topic enum to class" PR, and uses register_topic() to register slice-specific physical topics that are defined in SLICED_KAFKA_TOPIC_MAP in settings
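For illustration, the logical-to-physical mapping described above can be sketched as follows. The shape of SLICED_KAFKA_TOPIC_MAP and the "<topic>-<slice>" naming convention are assumptions made for this sketch, not necessarily Snuba's actual settings format:

```python
# Hypothetical sketch of the logical -> physical topic mapping described
# in the PR summary. The map's key/value shape and the "-<slice>" suffix
# are illustrative assumptions, not Snuba's real settings.
from typing import Mapping, Optional, Tuple

# Maps (logical topic name, slice id) -> physical topic name.
SLICED_KAFKA_TOPIC_MAP: Mapping[Tuple[str, int], str] = {
    ("snuba-generic-metrics", 1): "snuba-generic-metrics-1",
}

def get_physical_topic(logical_topic: str, slice_id: Optional[int]) -> str:
    """Resolve the slice-specific physical topic, falling back to the
    logical topic name when no slice id or no override is configured."""
    if slice_id is None:
        return logical_topic
    return SLICED_KAFKA_TOPIC_MAP.get((logical_topic, slice_id), logical_topic)
```

In this sketch the same resolution would be applied to both the input consumption topic and the commit log topic.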

Next steps:

  • Add some sort of verification that allows us to check whether the slice id being passed in is valid for the specific storage
  • Further testing

@ayirr7 ayirr7 requested a review from a team as a code owner October 14, 2022 05:58
@ayirr7 ayirr7 changed the base branch from master to enoch/remove-topics-enum October 14, 2022 06:00
codecov-commenter commented Oct 14, 2022

Codecov Report

Base: 92.93% // Head: 21.81% // Decreases project coverage by 71.12% ⚠️

Coverage data is based on head (dde5c5a) compared to base (54e8a43).
Patch coverage: 6.94% of modified lines in pull request are covered.

Additional details and impacted files

```
@@             Coverage Diff             @@
##           master    #3259       +/-   ##
===========================================
- Coverage   92.93%   21.81%   -71.12%
===========================================
  Files         702      663       -39
  Lines       32256    31249     -1007
===========================================
- Hits        29976     6817    -23159
- Misses       2280    24432    +22152
```

| Impacted Files | Coverage | Δ |
| --- | --- | --- |
| snuba/cli/consumer.py | 0.00% <0.00%> | -96.08% ⬇️ |
| snuba/cli/test_consumer.py | 0.00% <ø> | -71.43% ⬇️ |
| snuba/consumers/consumer.py | 0.00% <0.00%> | -84.40% ⬇️ |
| snuba/consumers/consumer_builder.py | 0.00% <0.00%> | -91.90% ⬇️ |
| tests/datasets/test_table_storage.py | 0.00% <0.00%> | ø |
| tests/utils/streams/test_kafka_config.py | 0.00% <0.00%> | -100.00% ⬇️ |
| snuba/utils/streams/configuration_builder.py | 47.82% <16.66%> | -52.18% ⬇️ |
| snuba/datasets/table_storage.py | 67.93% <28.57%> | -26.47% ⬇️ |
| snuba/datasets/partitioning.py | 38.88% <33.33%> | -61.12% ⬇️ |
| tests/base.py | 0.00% <0.00%> | -100.00% ⬇️ |
| … and 650 more | | |


☔ View full report at Codecov.

onewland (Contributor) commented Oct 14, 2022

Your change currently does not select the correct ClickHouse cluster for the given slice, so that needs to be fixed before a merge is possible (you can't rely on the changes we made to the query path).

Given that the TableWriter gives you access to the Kafka topic (indirectly via the stream loader) AND selects the cluster, I wonder if you could pass the slice ID through to the TableWriter's creation or to get_table_writer(), and hide more of the slice awareness from the consumer builder:

```python
return get_cluster(self.__storage_set).get_batch_writer(
    metrics,
    insert_statement,
    encoding=None,
    options=options,
    chunk_size=chunk_size,
    buffer_size=0,
)
```
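A minimal sketch of that suggestion, with illustrative stand-in names (for simplicity, get_cluster here returns a plain string rather than a real ClickHouse cluster object):

```python
# Illustrative sketch of threading the slice id through TableWriter
# creation so that cluster selection stays hidden from the consumer
# builder. All names and return types are stand-ins, not Snuba's API.
from typing import Optional

def get_cluster(storage_set: str, slice_id: Optional[int] = None) -> str:
    # Stand-in for real cluster lookup: a sliced storage set resolves to
    # a slice-specific cluster, an unsliced one to the default cluster.
    if slice_id is None:
        return f"{storage_set}-cluster"
    return f"{storage_set}-cluster-{slice_id}"

class TableWriter:
    def __init__(self, storage_set: str, slice_id: Optional[int] = None) -> None:
        self.__storage_set = storage_set
        self.__slice_id = slice_id  # populated at creation, as suggested

    def get_batch_writer(self) -> str:
        # Cluster selection is slice-aware here; callers such as the
        # consumer builder never need to see the slice id.
        return get_cluster(self.__storage_set, self.__slice_id)
```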

@enochtangg enochtangg force-pushed the enoch/remove-topics-enum branch from e7ee9ab to 311b3f8 Compare October 14, 2022 16:59
ayirr7 (Member, Author) commented Oct 14, 2022

> Given that the TableWriter gives you access to the Kafka topic (indirectly via the stream loader) AND selects the cluster, I wonder if you could pass through the slice ID to TableWriters creation or to get_table_writer() and hide more slice awareness from the consumer builder

In order to facilitate cluster selection using slice id, I modified get_table_writer() and the associated superclass to take in slice id. This populates a TableWriter object's slice id attribute, which I added. Now, we can get a slice- and cluster-specific BatchWriter from the TableWriter object.

One alternative to this would have been to modify the WritableTableStorage class to take in slice id upon initialization. However, it seemed to me that this would be a more invasive/refactoring-heavy change.

I'll be making more organizational changes to the code next (e.g. putting the logical to physical topic mapping into a function, etc.)

Base automatically changed from enoch/remove-topics-enum to master October 17, 2022 14:16
@ayirr7 ayirr7 changed the base branch from master to enoch/remove-topics-enum October 17, 2022 23:03
@ayirr7 ayirr7 requested a review from onewland October 18, 2022 20:10
onewland (Contributor) left a comment:

Pending my comments, I think I'm fine with this approach.

Have you tried testing locally by creating a sliced cluster/topic, manually starting a consumer, and seeing if you can write to slice 1?

ayirr7 (Member, Author) commented Oct 20, 2022

> Have you tried testing locally by creating a sliced cluster/topic, manually starting a consumer, and seeing if you can write to slice 1?

A few updates to this PR:

Tested consumer behavior using the snuba-generic-metrics topic with slice_id = 1 (physical topic snuba-generic-metrics-1). I was able to successfully produce to and consume from snuba-generic-metrics-1, with the storage being generic_metrics_distributions_raw. I've written up the configuration changes and commands I used for testing in a Notion doc, which I am happy to share.

Removed the logic for mapping logical Snuba topics to physical/sliced topics. Snuba topics are only used for retrieving Kafka broker config in the ConsumerBuilder, and broker config can be looked up with just a (logical topic name, slice id) pair.

Kept the logic for mapping logical Arroyo topics to physical/sliced topics, and passed the sliced topic into the StreamProcessor logic so that the consumer connects to the correct topic (snuba-generic-metrics-1 in the example above).

Finally, fixed what may be a small typo in the "Topic enum to class" PR (since this PR builds on it). Will update here as necessary.

Next steps include addressing other comments above regarding KafkaTopicSpec, etc.
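The broker-config lookup described above (keyed by a logical topic name and slice id) might look roughly like this; DEFAULT_BROKER_CONFIG and SLICED_KAFKA_BROKER_CONFIG are hypothetical names invented for this sketch:

```python
# Sketch of retrieving Kafka broker config from a (logical topic name,
# slice id) pair, per the comment above. The settings names and their
# shapes are assumptions for illustration, not Snuba's real settings.
from typing import Mapping, Optional, Tuple

DEFAULT_BROKER_CONFIG: Mapping[str, str] = {
    "bootstrap.servers": "127.0.0.1:9092",
}

# Slice-specific broker overrides, keyed by (logical topic name, slice id).
SLICED_KAFKA_BROKER_CONFIG: Mapping[Tuple[str, int], Mapping[str, str]] = {
    ("snuba-generic-metrics", 1): {"bootstrap.servers": "127.0.0.1:9093"},
}

def get_broker_config(
    logical_topic: str, slice_id: Optional[int]
) -> Mapping[str, str]:
    """Return the sliced broker config when one is defined for the pair,
    otherwise fall back to the default broker config."""
    if slice_id is not None:
        sliced = SLICED_KAFKA_BROKER_CONFIG.get((logical_topic, slice_id))
        if sliced is not None:
            return sliced
    return DEFAULT_BROKER_CONFIG
```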

onewland (Contributor) left a comment:

Looks like we're going in the right direction to me. Thanks for doing more in-depth testing! It might be annoying, but it's definitely worse to find errors post-deploy.

@ayirr7 ayirr7 changed the base branch from enoch/remove-topics-enum to master October 20, 2022 18:17
@ayirr7 ayirr7 requested a review from nikhars October 31, 2022 07:03
lynnagara (Member) left a comment:

Thanks, I think this is looking better. My main comments are around consistency. We need to make the slices behave the same as slice 0; currently there are too many subtle differences.

  • Passing slice=0 should work, and it should behave the same as slice=None (i.e., not passing a slice ID). Currently it behaves in pretty unexpected ways, and you will probably get something broken if you pass slice=0.
  • We need to support topic overrides, and the same features in general, for slices. Why can't you override a topic when using a slice? Code for the various slices should behave the same; these small differences are very unexpected.
  • Validation should be done in the shared KafkaTopicSpec so it runs automatically every time we fetch the slice data, not implemented and called independently in every separate entrypoint, of which there will be many.
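The consistency points above could look roughly like this in a shared spec. KafkaTopicSpec here is a simplified stand-in for Snuba's class, and SLICE_COUNT is a hypothetical settings entry:

```python
# Illustrative sketch of the review feedback: slice_id=0 resolves the
# same as slice_id=None, and slice validation lives in the shared spec
# rather than in each entrypoint. Names are stand-ins, not Snuba's API.
from typing import Optional

SLICE_COUNT = {"snuba-generic-metrics": 2}  # hypothetical per-topic slice counts

class KafkaTopicSpec:
    def __init__(self, logical_topic: str) -> None:
        self.logical_topic = logical_topic

    def get_physical_topic_name(self, slice_id: Optional[int] = None) -> str:
        # Validation runs on every fetch, so individual entrypoints
        # (consumer CLI, etc.) don't need to reimplement it.
        if slice_id is not None:
            count = SLICE_COUNT.get(self.logical_topic, 1)
            if not 0 <= slice_id < count:
                raise ValueError(
                    f"invalid slice {slice_id} for {self.logical_topic}"
                )
        # Slice 0 (or no slice) behaves like the unsliced default.
        if slice_id is None or slice_id == 0:
            return self.logical_topic
        return f"{self.logical_topic}-{slice_id}"
```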

```diff
@@ -109,29 +129,42 @@ def __init__(

 stream_loader = self.storage.get_table_writer().get_stream_loader()

-self.raw_topic: Topic
+self.raw_topic: ArroyoTopic
```
lynnagara (Member):

Personal preference, I'd rather you just left all these names.

ayirr7 (Member, Author):

The issue with leaving the names is the potential confusion between the (Snuba) Topic and the (Arroyo) Topic. If both are named Topic (as they originally were), the code becomes confusing to work with in future iterations.

lynnagara (Member):

I think the issue is that the other topic (SnubaTopic) is being imported here, but that should never be the case. It should always be accessed via the stream loader and KafkaTopicSpec mechanism to ensure that topic resolution is done correctly.

ayirr7 (Member, Author):

This makes sense. I no longer have those imports. Shall I change ArroyoTopic back to Topic?

lynnagara (Member) commented Nov 5, 2022:

Yes, that seems good to me. It feels a little more consistent with what the other files do.

lynnagara (Member):

Bumping this. Can you please revert the change?

nikhars (Member) left a comment:

LGTM

@ayirr7 ayirr7 enabled auto-merge (squash) November 14, 2022 18:08
@ayirr7 ayirr7 requested a review from lynnagara November 14, 2022 18:09
@ayirr7 ayirr7 merged commit 3603555 into master Nov 14, 2022
@ayirr7 ayirr7 deleted the sliced-consumers-cli branch November 14, 2022 19:32