Skip to content

Small streaming sync improvements #242


Merged: 3 commits merged into main from feat/group-sync-lines on Feb 10, 2025

Conversation

simolus3 (Contributor) commented on Feb 7, 2025

Most of these changes are from my work on bucket priorities; I'm moving some of the generic changes into this PR so that the priorities PR will be easier to review. I also wanted to take a look at grouping multiple data lines received from the sync service into a single transaction, but that will only have an impact when there are many small buckets.
So, this PR:

  1. Introduces a common sealed class for all lines we might receive from the server, instead of representing sync lines as plain Object values (see the sketch after this list).
  2. If there are many buckets to synchronize, each with only a few items, it's possible that we receive multiple data lines in quick succession. This PR implements a small optimization to insert them in a single transaction when that happens (previously we'd use one transaction per bucket). This may slightly improve sync times when many tiny buckets are involved.
  3. Sets up a test that can exercise the streaming sync implementation on the web too. For simplicity, we keep everything in a single isolate there (avoiding web workers). This allows running a mocked sync service through an in-memory channel, which is much easier to control than a hybridMain setup spawning a proper socket server.
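
For illustration, a minimal sketch of what such a sealed hierarchy could look like in Dart 3 (the type and field names are made up for this example, not the actual classes added by the PR):

```dart
/// Hypothetical sketch of a sealed sync-line hierarchy; names are
/// illustrative, not the actual types introduced by this PR.
sealed class StreamingSyncLine {}

final class SyncDataLine extends StreamingSyncLine {
  final String bucket;
  final List<Map<String, Object?>> operations;
  SyncDataLine(this.bucket, this.operations);

  int get operationCount => operations.length;
}

final class CheckpointCompleteLine extends StreamingSyncLine {
  final String lastOpId;
  CheckpointCompleteLine(this.lastOpId);
}

final class KeepAliveLine extends StreamingSyncLine {
  final int tokenExpiresIn;
  KeepAliveLine(this.tokenExpiresIn);
}

/// With a sealed class, handling code can switch exhaustively instead of
/// testing a plain Object against known shapes.
void handleLine(StreamingSyncLine line) {
  switch (line) {
    case SyncDataLine(:final bucket, :final operationCount):
      print('data for $bucket ($operationCount ops)');
    case CheckpointCompleteLine(:final lastOpId):
      print('checkpoint complete at $lastOpId');
    case KeepAliveLine(:final tokenExpiresIn):
      print('keep-alive, token expires in ${tokenExpiresIn}s');
  }
}
```

The advantage over passing around plain Object values is that the compiler checks the switch for exhaustiveness, so adding a new line type forces every handler to be updated.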

rkistner (Contributor) left a comment


I didn't confirm the specific EventSink details, but overall the implementation here looks good.

Two notes:

  1. Could you limit the size of batches (probably by number of operations)? This would avoid keeping too much data in memory if it arrives at a high rate.
  2. Related to the above, how is back-pressure handled if the server sends data faster than the client can persist it?

I understand that in this case we need to handle back-pressure differently, since we actually need to "see" further ahead in the queue. But I don't fully understand how back-pressure works, or could work, with the EventSink implementation.

simolus3 (Contributor, Author) commented on Feb 7, 2025

> Could you limit the size of batches (probably by number of operations)?

I've disabled the internal buffer for batches that contain more than 100 operations, and I'm also flushing it once it reaches 1000 buffered operations. Since this is most effective for the "many buckets with few operations" case, I think this matches what we want.
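
To make the policy concrete, here is a rough sketch using those numbers (the persistence interface and helper names below are placeholders for this example, not the SDK's actual API; SyncDataLine refers to the sealed-class sketch above):

```dart
// Placeholder persistence interface for this sketch (not the real SDK API).
abstract interface class SyncDatabase {
  Future<void> writeTransaction(Future<void> Function(SyncDatabase tx) action);
  Future<void> insertLine(SyncDataLine line);
}

const bypassBatchThreshold = 100; // large lines skip the buffer entirely
const maxBufferedOperations = 1000; // flush once this many ops are buffered

final List<SyncDataLine> _buffer = [];
var _bufferedOps = 0;

Future<void> handleDataLine(SyncDataLine line, SyncDatabase db) async {
  if (line.operationCount > bypassBatchThreshold) {
    // Grouping gains little for large lines; write them on their own.
    await flushBuffer(db);
    await db.writeTransaction((tx) => tx.insertLine(line));
    return;
  }

  _buffer.add(line);
  _bufferedOps += line.operationCount;
  if (_bufferedOps >= maxBufferedOperations) {
    await flushBuffer(db);
  }
}

Future<void> flushBuffer(SyncDatabase db) async {
  if (_buffer.isEmpty) return;
  final pending = List.of(_buffer);
  _buffer.clear();
  _bufferedOps = 0;

  // One transaction for all buffered lines, instead of one per bucket.
  await db.writeTransaction((tx) async {
    for (final line in pending) {
      await tx.insertLine(line);
    }
  });
}
```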

> Related to the above, how is back-pressure handled if the server sends data faster than the client can persist it?

Stream.eventTransformed immediately forwards pause and resume events to the source stream subscription. Every suspension in an await for loop triggers a pause, which should propagate all the way to the response stream. A small difference is that, since we can buffer a chunk internally, it's possible that the downstream listener pauses the subscription (which is forwarded to the HTTP client) and we still emit the pending chunk afterwards. That's not a problem though: it just moves where the chunk is buffered from the event sink to the listener, and it's buffered either way because we've already received the data at that point.
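
For context, this is roughly the shape of the Stream.eventTransformed setup being discussed (a simplified sketch: the sink below only decodes JSON lines and is not the SDK's actual transformer):

```dart
import 'dart:async';
import 'dart:convert';

/// Sketch of the Stream.eventTransformed pattern mentioned above.
/// The transformed stream forwards pause/resume directly to the source
/// subscription, so back-pressure from an `await for` consumer reaches the
/// underlying HTTP response stream.
Stream<Map<String, Object?>> decodeLines(Stream<String> jsonLines) {
  return Stream.eventTransformed(
    jsonLines,
    (EventSink<Map<String, Object?>> out) => _DecodeSink(out),
  );
}

class _DecodeSink implements EventSink<String> {
  final EventSink<Map<String, Object?>> _out;
  _DecodeSink(this._out);

  @override
  void add(String event) {
    // At most one decoded event is pending here, so a downstream pause only
    // delays this single chunk; the data has already been received anyway.
    _out.add(jsonDecode(event) as Map<String, Object?>);
  }

  @override
  void addError(Object error, [StackTrace? stackTrace]) =>
      _out.addError(error, stackTrace);

  @override
  void close() => _out.close();
}
```

Because the sink is driven directly from the source subscription, the only buffering it introduces is whatever it chooses to hold itself, which is the single pending chunk described above.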

rkistner (Contributor) commented:

Great, that makes sense.

simolus3 merged commit 1f9268f into main on Feb 10, 2025
4 checks passed
simolus3 deleted the feat/group-sync-lines branch on February 10, 2025 at 08:49