Investigate label compression #5870
Comments
Did a small experiment on the querier's side with a typical user query that we get here. Only a very small set of hostnames and pod names are unique; all of the other values repeat more than once. Some strings repeat as many as 8k or even 12k times 😱 I computed the total size of all strings in that query versus the size with each string counted only once. Here are some results:
(https://gist.github.com/GiedriusS/aada711443326ea452ec5c4c0c508b07) In this test, chunks data takes up around ~130KB. This means that, for each StoreAPI, we could save around ~97% of the traffic used just for sending labels. In total, the reduction would be around 90%. I tested with an instant query. I guess that with a range query and constant labels this gain would be smaller because more chunks data would need to be sent. The only caveat I can think of is that we either won't be able to have a streaming Select(), or there would be more round-trips if we were to stream the lookup table gradually as it is built up. |
@GiedriusS I think this also applies to the Query Frontend side. Edit: actually, it seems we are not using the gRPC query API now; it is still the HTTP API. 😢 I am wondering if we have any plan to use that gRPC query API. Right now it is not used anywhere. |
Dictionary encoding would definitely be an awesome addition. One thing that might not be as straightforward is proxying encoded series through a querier that is used as a gRPC proxy. The simplest way would be to decode and re-encode all series in this middleware querier using a union of all received maps, but that would require blocking and buffering. Maybe there is a way to make the encoding composable so that only the root querier has to do the decoding. Regarding the gRPC Query API, I added it because I thought it would be needed for pushdown and sharding. But in the end, we were able to do everything using the existing HTTP API. It would be nice to start using the gRPC one, but if the migration is too complex or if the API is hard to maintain, I would also be okay with deprecating it. |
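The decode/re-encode step described above could look roughly like the following sketch: the middleware querier translates each store's series refs into a union symbol table it builds as responses arrive. All names here (`remap`, the table shapes) are illustrative assumptions, not existing Thanos APIs.

```go
package main

import "fmt"

// remap translates refs encoded against one store's symbol table into a
// union table owned by the proxying querier. Unseen strings are appended
// to the union table; previously seen strings reuse their existing ref.
func remap(refs []uint64, storeSymbols []string, union map[string]uint64, unionSymbols *[]string) []uint64 {
	out := make([]uint64, len(refs))
	for i, r := range refs {
		s := storeSymbols[r]
		ur, ok := union[s]
		if !ok {
			ur = uint64(len(*unionSymbols))
			union[s] = ur
			*unionSymbols = append(*unionSymbols, s)
		}
		out[i] = ur
	}
	return out
}

func main() {
	union := map[string]uint64{}
	var unionSyms []string
	// Two stores built their tables independently, so refs disagree.
	storeA := []string{"pod", "querier-0"}
	storeB := []string{"querier-1", "pod"}
	fmt.Println(remap([]uint64{0, 1}, storeA, union, &unionSyms)) // [0 1]
	fmt.Println(remap([]uint64{0, 1}, storeB, union, &unionSyms)) // [2 0]
}
```

This is exactly the blocking-and-buffering cost mentioned above: the union table is only complete once every upstream response has been remapped.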
Yeah, I was thinking that if there were a function that worked identically across multiple StoreAPIs, we could compose responses from them. An ordinary hashing function would work, but the hashes could become longer than the original strings. Perhaps the function could exploit the fact that label names/values have strict requirements, i.e. they do not span the full Unicode space. I have googled for a bit but haven't found any good function that would work here. |
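To make the idea above concrete: a content-derived ref (here plain FNV-1a, chosen only for illustration; `contentRef` is a hypothetical name, not anything in Thanos) gives every StoreAPI the same symbol for the same string with no coordination. The drawbacks raised in the comment still apply: an 8-byte hash can exceed a short label value's length, and collisions would need handling.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// contentRef derives a symbol from the string's content, so independently
// built symbol tables on different StoreAPIs agree without a shared state.
func contentRef(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

func main() {
	// Two stores computing the ref independently get the same value.
	fmt.Println(contentRef("pod") == contentRef("pod")) // true
	// But the 8-byte ref is longer than the 3-byte string "pod" itself,
	// which is the size concern raised in the discussion.
	fmt.Println(len("pod") < 8) // true
}
```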
Is your proposal related to a problem?
We currently send label names/values as bare strings in each Series() call. Even with gRPC compression turned on, I believe compression is applied to each streamed message individually rather than to the stream as a whole:
(https://grpc.github.io/grpc/core/md_doc_compression.html)
Parca uses a deduplicated string table for compression (parca-dev/parca#1976) and got good results, so perhaps we could use the same idea. Prometheus's TSDB also does this: it interns all strings so they are not repeated: https://github.com/prometheus/prometheus/blob/main/tsdb/docs/format/index.md#symbol-table.
Describe the solution you'd like
Create a lookup table while sending back Series responses, and send the lookup table to the client at the end.
Note that because this kind of compression applies to the whole response, it would no longer be possible to make a fully streamed Select() call for the PromQL engine. An alternative would be to send the table incrementally instead of at the end of the whole stream, but then there would be more round-trips.
Describe alternatives you've considered
N/A
Additional context
https://cloud-native.slack.com/archives/CL25937SP/p1667349702590129