feat(replays): initial replays clickhouse migration #2681
Conversation
This PR has a migration; here is the generated SQL:

-- start migrations
-- migration replays : 0001_replays

Local operations:

CREATE TABLE IF NOT EXISTS replays_local (
    replay_id UUID,
    sequence_id UInt16,
    trace_ids Array(UUID),
    _trace_ids_hashed UInt64 MATERIALIZED arrayMap(t -> cityHash64(t), trace_ids),
    title String,
    project_id UInt64,
    timestamp DateTime,
    platform LowCardinality(String),
    environment LowCardinality(Nullable(String)),
    release Nullable(String),
    dist Nullable(String),
    ip_address_v4 Nullable(IPv4),
    ip_address_v6 Nullable(IPv6),
    user String,
    user_hash UInt64,
    user_id Nullable(String),
    user_name Nullable(String),
    user_email Nullable(String),
    sdk_name String,
    sdk_version String,
    tags Nested(key String, value String),
    retention_days UInt16,
    partition UInt16,
    offset UInt64
)
ENGINE ReplicatedReplacingMergeTree('/clickhouse/tables/replays/{shard}/default/replays_local', '{replica}')
ORDER BY (project_id, toStartOfDay(timestamp), cityHash64(replay_id), sequence_id)
PARTITION BY (retention_days, toMonday(timestamp))
TTL timestamp + toIntervalDay(retention_days)
SETTINGS index_granularity=8192;

ALTER TABLE replays_local ADD INDEX IF NOT EXISTS bf_trace_ids_hashed _trace_ids_hashed TYPE bloom_filter() GRANULARITY 1;

Dist operations:

CREATE TABLE IF NOT EXISTS replays_dist (
    replay_id UUID,
    sequence_id UInt16,
    trace_ids Array(UUID),
    _trace_ids_hashed UInt64 MATERIALIZED arrayMap(t -> cityHash64(t), trace_ids),
    title String,
    project_id UInt64,
    timestamp DateTime,
    platform LowCardinality(String),
    environment LowCardinality(Nullable(String)),
    release Nullable(String),
    dist Nullable(String),
    ip_address_v4 Nullable(IPv4),
    ip_address_v6 Nullable(IPv6),
    user String,
    user_hash UInt64,
    user_id Nullable(String),
    user_name Nullable(String),
    user_email Nullable(String),
    sdk_name String,
    sdk_version String,
    tags Nested(key String, value String),
    retention_days UInt16,
    partition UInt16,
    offset UInt64
)
ENGINE Distributed(cluster_one_sh, default, replays_local, cityHash64(toString(replay_id)));

-- end migration replays : 0001_replays
raw_columns: Sequence[Column[Modifiers]] = [
    Column("replay_id", UUID()),
    Column("sequence_id", UInt(16)),
Who generates the sequence_id? On the SDK? Sentry? Snuba? What's the max number allowed here?
The SDK will generate the sequence_id. The max will be somewhere between ~100 and ~1000 (we will be capping a replay's maximum length time-wise and from there find a sane max sequence_id).
columns=raw_columns,
engine=table_engines.ReplacingMergeTree(
    storage_set=StorageSetKey.REPLAYS,
    order_by="(project_id, toStartOfDay(timestamp), cityHash64(replay_id), sequence_id)",
Just to confirm, items with the same replay_id can still span multiple days, right?
Yes, and I'm glad you brought this up. The intention is that replays which span multiple days will only show up when the initial event is within the queried time range.
Probably an edge case, but if we do receive replays on different days with the same replay_id and sequence_id, they will not get merged together, and we'd need a strategy to deduplicate them when querying.
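One sketch of such a query-time strategy (illustrative only, using the replays_dist table defined in this migration): ClickHouse's LIMIT BY can keep a single row per (replay_id, sequence_id) pair even when the ReplacingMergeTree has not merged the duplicates yet.

-- Keep one row per (replay_id, sequence_id), regardless of which day it arrived on.
SELECT *
FROM replays_dist
WHERE project_id = 1
  AND timestamp >= now() - INTERVAL 7 DAY
ORDER BY replay_id, sequence_id
LIMIT 1 BY replay_id, sequence_id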
@JoshFerge Could you please provide some insight into the most common query patterns you expect?
What are you going to filter by most of the time?
What are you going to aggregate, if anything?
The ORDER BY key has to be defined based on the expected query pattern. You cannot change it once done without rebuilding the table entirely, and getting it wrong will make your query performance miserable.
The expected query pattern also impacts which (if any) data skipping indexes should be added. We cannot add indexes to all columns, as the type of index depends on the queries you want to make faster.
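For context, a query shaped to match the proposed sorting key (project, day, replay hash, sequence) would look roughly like the sketch below; this is an illustration of the access pattern, not a query taken from the PR.

-- Filter on the leading columns of the ORDER BY key (project_id, then a bounded
-- time range) so ClickHouse can prune granules effectively.
SELECT replay_id, min(timestamp) AS started_at
FROM replays_dist
WHERE project_id = 1
  AND timestamp >= toDateTime('2022-05-01 00:00:00')
  AND timestamp <  toDateTime('2022-05-08 00:00:00')
GROUP BY replay_id
ORDER BY started_at DESC
LIMIT 50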
https://www.notion.so/sentry/Addendum-Replay-Queries-fcfd8e68679e443e87649014cf10ae62
See above ^^. We will be happy to rebuild the table entirely while we are testing over the next several months, so we are viewing all data as temporary and will make that clear to any customers testing. We will be building several use cases on top of this initial one that may require us to rebuild the tables at any rate.
Re: the document you linked:
"As a replays user, I want to see all replays where an error occurred"
The table designed here does not seem to have a reference to an issue or an error. Is that correct, or a mistake?
"As a performance user, I want to see if this trace has a replay associated with it"
If you want to search the replays table with WHERE has(trace_ids, 'asdasdasdasd'), please add a bloom filter index on that column (which may require you to create a materialized version with hashes of that column). Otherwise your search will be miserable.
But it would be better to add a replay id on the transaction in some way so you do not have to scan the whole replay table to associate replays to traces.
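For reference, the generated SQL at the top of this PR ends up doing exactly this: a materialized _trace_ids_hashed column plus a bloom_filter data skipping index. A hedged sketch of how a lookup could then use it, assuming the hashed column holds one cityHash64 value per trace ID:

-- Probe the hashed, indexed column so the bloom filter can skip granules,
-- then re-check the raw UUID array to rule out hash collisions.
SELECT replay_id
FROM replays_dist
WHERE project_id = 1
  AND has(_trace_ids_hashed, cityHash64(toUUID('00000000-0000-0000-0000-000000000000')))
  AND has(trace_ids, toUUID('00000000-0000-0000-0000-000000000000'))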
The table designed here does not seem to have a reference to an issue or an error. Is that correct, or a mistake?
Not a mistake. For now, we will do a very rudimentary query where we take the trace IDs from a single page and look them up to determine if there is an associated error, or do a search on the errors table looking for the replays tag.
The issue is that since errors can be sampled / dropped (and replays too in the future), tagging each other's events with the IDs is problematic because it's not guaranteed that the tagged ID will exist.
We'll likely need some separate table, generated in event post_processing, that can accurately associate ingested events with replays. This will come in a future iteration.
But it would be better to add a replay id on the transaction in some way so you do not have to scan the whole replay table to associate replays to traces.
We'll also be adding replay_id to other events, so for example this search can use transactions tagged with a replay_id. (There is still the sampling problem, but we're not going to worry about that in the first iteration.)
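A hypothetical sketch of that transaction-side lookup, assuming transactions carry a replayId tag; the table, tag name, and columns here are illustrative and not part of this PR:

-- Find recent transactions tagged with a given replay, then use their trace IDs.
SELECT event_id, trace_id
FROM transactions_dist
WHERE project_id = 1
  AND finish_ts >= now() - INTERVAL 1 DAY
  AND tags.value[indexOf(tags.key, 'replayId')] = '00000000-0000-0000-0000-000000000000'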
And for now I will not add the bloom filter index; I'll add a TODO. Something I can follow up on.
And for now I will not add the bloom filter index; I'll add a TODO. Something I can follow up on.
Please do not wait on this. Not doing it means a full table scan each time, and the effort to add the index is minimal. ClickHouse tables get large very quickly.
Went ahead and added the index 👍🏼
raw_columns: Sequence[Column[Modifiers]] = [
    Column("replay_id", UUID()),
    Column("sequence_id", UInt(16)),
    Column("trace_ids", Array(UUID())),
What are these trace_ids? Are they supposed to be pointers to some other piece of data in one of our systems?
getsentry/sentry-replay#38 (comment)
Replays can have N trace_ids, and each update may have N of them; trace_id will be the link between them to start (we won't be doing any joins with them).
Did you ever get a sense of how many trace IDs could be in this field?
On a per-row basis it likely won't be more than 10.
Is there a document I can read on the use cases replays is trying to solve? More specifically, what sort of queries would be run against this dataset?
raw_columns: Sequence[Column[Modifiers]] = [
    Column("replay_id", UUID()),
    Column("sequence_id", UInt(16)),
Is there any sort of relation between the sequence_id and replay_id fields?
sequence_id will be a monotonically increasing counter, so for each replay_id, sequence_id is unique.
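That relationship is what lets a single replay be reassembled by ordering its rows; a sketch against the table in this migration (filter values are illustrative):

-- Fetch all updates for one replay in the order the SDK emitted them.
SELECT sequence_id, timestamp, trace_ids
FROM replays_dist
WHERE project_id = 1
  AND replay_id = toUUID('00000000-0000-0000-0000-000000000000')
ORDER BY sequence_id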
# sdk info
Column("sdk_name", String()),
Column("sdk_version", String()),
Column("tags", Nested([("key", String()), ("value", String())])),
For performance reasons, you might want to add a bloom filter index on tags, as we do on some of our other datasets.
good call 👍🏼 will look at adding those.
Are you actually going to search for replays by tag key/value?
likely
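If that index does get added later, one hedged sketch of the usual pattern is a materialized hash-map column plus a bloom_filter index; the column and index names below are illustrative and not part of this migration:

-- Hash each key=value pair into a flat array that a bloom filter can index.
ALTER TABLE replays_local
    ADD COLUMN IF NOT EXISTS _tags_hash_map Array(UInt64)
    MATERIALIZED arrayMap((k, v) -> cityHash64(concat(k, '=', v)), tags.key, tags.value);
ALTER TABLE replays_local
    ADD INDEX IF NOT EXISTS bf_tags_hash_map _tags_hash_map TYPE bloom_filter() GRANULARITY 1;

-- Queries would then probe the hashed column, e.g.
-- WHERE has(_tags_hash_map, cityHash64('browser=Chrome'))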
columns=raw_columns,
engine=table_engines.Distributed(
    local_table_name="replays_local",
    sharding_key="project_id",
What is the reason for sharding the data by project_id versus sharding randomly? One disadvantage I can see with sharding by project_id is that if there is a big project which uses replays a lot, the shards could become imbalanced.
Good point, I think I just chose this arbitrarily. I'll shard by replay_id instead.
Codecov Report
@@            Coverage Diff             @@
##           master    #2681      +/-   ##
==========================================
- Coverage   92.81%   92.77%   -0.05%
==========================================
  Files         609      612       +3
  Lines       28606    28662      +56
==========================================
+ Hits        26552    26591      +39
- Misses       2054     2071      +17
Continue to review full report at Codecov.
storage_set=StorageSetKey.REPLAYS,
table_name="replays_local",
columns=raw_columns,
engine=table_engines.ReplacingMergeTree(
Are you going to use the replacing feature for something (aside from removing duplicates, which is anyway a good idea)?
no, just removing duplicates.
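Worth noting as a general ClickHouse behavior (not specific to this PR): ReplacingMergeTree only collapses duplicates when parts merge, so reads that must be exact either deduplicate in the query or use FINAL, e.g.:

-- Force merge-on-read semantics; correct but slower than a plain SELECT.
SELECT count()
FROM replays_local FINAL
WHERE project_id = 1
  AND timestamp >= now() - INTERVAL 1 DAY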
Please fix the type errors
Fixed, I accidentally included an errant file which caused the errors.
Will merge Monday. Thanks for the reviews, all.
Summary
Creates the initial ClickHouse migration for the replays dataset. Rows will be uniquely identified by their replay_id (many rows will have the same replay_id) together with a monotonically increasing sequence_id that represents each additional piece of data for the replay.
This table will be used initially for simply searching and listing replays on the replays index page. Basic aggregations will be done to gather the duration of the replay and the list of trace_ids associated with the replay, to start.
Further columns will likely be introduced as product requirements determine the types of searches we want to do.
See the spec here for more information: https://www.notion.so/sentry/Session-Replay-V1-alpha-Ingest-Backend-ae068d1e1d514221b6c3ea2233f360f4
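A sketch of the kind of per-replay aggregation described above (replay duration plus the associated trace IDs), using the replays_dist table from this migration; filter values are illustrative:

-- One row per replay: duration from first to last update, plus all trace IDs seen.
SELECT
    replay_id,
    min(timestamp) AS started_at,
    dateDiff('second', min(timestamp), max(timestamp)) AS duration_seconds,
    groupUniqArrayArray(trace_ids) AS all_trace_ids
FROM replays_dist
WHERE project_id = 1
  AND timestamp >= now() - INTERVAL 14 DAY
GROUP BY replay_id
ORDER BY started_at DESC
LIMIT 50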