[sc-25841] Remove 4MB limit for gRPC message payloads #49


Merged
1 commit merged into master on Aug 30, 2023

Conversation

justinGilmer

Currently our logic chunks data sent over gRPC according to the number of rows of data; this led to large streamsets in the multivalue API erroring out while trying to receive this data on the client side.

This PR sets the limit for the client to receive from the server to be unlimited for the time being. This will allow arbitrarily-sized streamsets to have successful multivalue queries.

In the future we should update our logic to better handle these size limits when sending from the server, but this is a patch fix for now.
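For context, a minimal sketch of the kind of gRPC channel option involved (the endpoint and channel setup here are illustrative, not the actual btrdb connection code):

import grpc

# gRPC's Python client caps inbound messages at 4MB by default;
# passing -1 for grpc.max_receive_message_length removes that cap.
options = [
    ("grpc.max_receive_message_length", -1),  # -1 = unlimited receive size
]
channel = grpc.insecure_channel("btrdb.example.net:4410", options=options)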

I tested this with 5000 streams on ni4ai, using a sampling_frequency of 30 over 100 seconds of data; this would error out on the master branch, but successfully returns the data with this PR.

MWE (minimal working example):

import btrdb

# Connect using a profile that points at a cluster with many streams
conn = btrdb.connect(profile='CLUSTER_WITH_MANY_STREAMS')
streams = conn.streams_in_collection(*conn.list_collections(), is_collection_prefix=False)

# Query the most recent 100 seconds of data at a sampling_frequency of 30
end = btrdb.utils.timez.currently_as_ns()
start = end - btrdb.utils.timez.ns_delta(seconds=100)

strm_set = btrdb.stream.StreamSet(streams).filter(start=start, end=end, sampling_frequency=30)

# This call errored out on master for large streamsets; it succeeds with this PR
table = strm_set.arrow_values()

@justinGilmer justinGilmer self-assigned this Aug 29, 2023
@justinGilmer justinGilmer changed the title Remove 4MB limit for gRPC message payloads [sc-25841] Remove 4MB limit for gRPC message payloads Aug 29, 2023
@justinGilmer
Author

trying to link ticket

[sc-25841]


@jleifnf jleifnf left a comment


LGTM.
Wondering how having no limit on the send size would affect local querying of the data.

@justinGilmer
Author

I think there will still be a bit of a limit:

There will still be an upper bound, since we currently enforce only 5000 rows sent at a time per message, so a message can technically be: 5000 points * 8 bytes/point (assuming 64-bit values) * (N streams + 1 time stream).
My rough estimates: a 13K streamset is ~500MB per message, a 2.5K streamset is ~100MB.
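A back-of-the-envelope check of those estimates (illustrative only; it reads the formula as N value streams plus 1 shared time stream and ignores protobuf framing overhead):

rows_per_message = 5000   # current per-message row chunk size
bytes_per_point = 8       # 64-bit values

def approx_message_bytes(n_streams: int) -> int:
    # N value columns plus one shared time column
    return rows_per_message * bytes_per_point * (n_streams + 1)

print(approx_message_bytes(13_000) / 1e6)  # ~520 MB for a 13K streamset
print(approx_message_bytes(2_500) / 1e6)   # ~100 MB for a 2.5K streamset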

We have a similar limit on the send side when inserting data to the server as well.

@andrewchambers

andrewchambers commented Aug 29, 2023

Our server limits things itself; if we trust our server, then it shouldn't be a problem.

@justinGilmer justinGilmer merged commit 334ed89 into master Aug 30, 2023
@justinGilmer justinGilmer deleted the sc-25841 branch August 30, 2023 15:26