Skip to content

Conversation

@gefjon
Copy link
Contributor

@gefjon gefjon commented May 26, 2025

Description of Changes

and after releasing the datastore lock.

This commit moves some amount of work off the main eval_updates_sequential task and into a new background worker, the send_worker. The actual work performed is unchanged, it's just that some of it is moved to a different function which runs asynchronously after releasing the datastore lock. We expect two benefits from this:

  1. Most obviously, we're now doing less work while holding the lock, reducing the size of our critical section.
  2. We suspect that we had a thundering herd problem where we wake each client's WebSocket worker just before waking the next reducer task, making it wait in the Tokio task queue until all the client messages have been sent. This change should cause the reducer task to wake first, improving transaction throughput.

Moving the aggregation and broadcast to a separate task required enclosing the SubscriptionManager's client map in an Arc<RwLock>. There's some amount of tricky code in there, with comments, that has to go out of its way to avoid deadlocks.

I also chose to change the table_name to be an Arc<str> rather than a Box<str>, as it needed to be cloned from eval_updates_sequential into the send_worker, where previously it was just borrowed within eval_updates_sequential. We could further reduce cloning by threading the change from Box to Arc further through our code, but I chose to keep the change relatively contained within the subscription-related code.

Additionally, while reworking the SubscriptionManger's code, I changed from using tuples to structs in a few places. This leaked out a bit, and I changed some methods which used to take (args: (T, U)) to instead take arg_t: T, arg_u: U.
The only callers outside of module_subscription_manager.rs (the ones I was changing anyways) were constructing the tuple in the call anyways, and so now look cleaner.

Many tests are changed to now require a Tokio runtime, so that the SubscriptionManager can tokio::spawn its Self::send_worker future. Some tests which previously had a single-threaded runtime now require a multi-threaded one. This is mildly painful.

A new metric, subscription_send_queue_length, is also added. It is incremented by eval_updates_sequential before pushing into the new queue, and decremented by the send_worker after popping.

Because the metric is labeled with database_identity: Identity, the SubscriptionManager now needs to know its Identity, and so cannot be Default. Methods are added to make constructing one in tests easier.

API and ABI breaking changes

N/a

Expected complexity level and risk

3, some risk of deadlock due to added synchronization with the client map.

Testing

  • I believe automated testing is sufficient to validate semantics.
  • Would like a bot test to observe hopefully-improved performance.

and after releasing the datastore lock.

This commit moves some amount of work off the main `eval_updates_sequential` task
and into a new background worker, the `send_worker`.
The actual work performed is unchanged, it's just that some of it is moved to a different function
which runs asynchronously after releasing the datastore lock.
We expect two benefits from this:

1. Most obviously, we're now doing less work while holding the lock,
   reducing the size of our critical section.
2. We suspect that we had a thundering herd problem
   where we wake each client's WebSocket worker just before waking the next reducer task,
   making it wait in the Tokio task queue until all the client messages have been sent.
   This change should cause the reducer task to wake first, improving transaction throughput.

Moving the aggregation and broadcast to a separate task required
enclosing the `SubscriptionManager`'s client map in an `Arc<RwLock>`.
There's some amount of tricky code in there, with comments, that has to go out of its way
to avoid deadlocks.

I also chose to change the `table_name` to be an `Arc<str>` rather than a `Box<str>`,
as it needed to be cloned from `eval_updates_sequential` into the `send_worker`,
where previously it was just borrowed within `eval_updates_sequential`.
We could further reduce cloning
by threading the change from `Box` to `Arc` further through our code,
but I chose to keep the change relatively contained within the subscription-related code.

Additionally, while reworking the `SubscriptionManger`'s code,
I changed from using tuples to structs in a few places.
This leaked out a bit, and I changed some methods which used to take `(args: (T, U))`
to instead take  `arg_t: T, arg_u: U`.
The only callers outside of `module_subscription_manager.rs` (the ones I was changing anyways)
were constructing the tuple in the call anyways, and so now look cleaner.

Many tests are changed to now require a Tokio runtime, so that the `SubscriptionManager`
can `tokio::spawn` its `Self::send_worker` future.
Some tests which previously had a single-threaded runtime now require a multi-threaded one.
This is mildly painful.
@gefjon gefjon requested a review from Centril as a code owner May 26, 2025 13:44
@gefjon gefjon requested a review from joshua-spacetime May 26, 2025 13:45
@gefjon
Copy link
Contributor Author

gefjon commented May 26, 2025

@joshua-spacetime suggested that, rather than enclosing the ClientsMap in Arc<RwLock> and sharing it between the SubscriptionManager and the send_worker, we might instead Arc up the ClientInfos and clone them into the ComputedQueries message. At least in theory, that would be simpler and more maintainable. I took a glance, and found this to be more complicated than I wanted: parts of ClientInfo are mutable, but only the immutable (or interior-mutable) parts are used by the send_worker (namely the outbound_ref and the dropped), so this would involve separating the ClientInfo somehow into two distinct parts. That's more of a change than I want to make right now, especially given that I can't confidently say whether it would be slower or not.

@joshua-spacetime joshua-spacetime linked an issue May 27, 2025 that may be closed by this pull request
@Centril
Copy link
Contributor

Centril commented May 27, 2025

I also chose to change the table_name to be an Arc<str> rather than a Box<str>, as it needed to be cloned from eval_updates_sequential into the send_worker, where previously it was just borrowed within eval_updates_sequential. We could further reduce cloning
by threading the change from Box to Arc further through our code, but I chose to keep the change relatively contained within the subscription-related code.

Could you please newtype this to TableName so that we can be free to change the implementation in the future? We might want to use SSO and whatnot (see flexstr, ...). If possible, you could pull that out and we can review it quickly separately.

@bfops bfops added release-any To be landed in any release window performance A PR/Issue related to improving performance of stdb labels May 27, 2025
gefjon added a commit that referenced this pull request May 27, 2025
Per Mazdak's request on #2793
@joshua-spacetime joshua-spacetime added release-next and removed release-any To be landed in any release window labels May 27, 2025
@joshua-spacetime
Copy link
Collaborator

After:
Screenshot 2025-05-27 at 12 34 29 PM

Before:
Screenshot 2025-05-27 at 12 33 57 PM

@gefjon gefjon enabled auto-merge May 27, 2025 19:47
@gefjon gefjon disabled auto-merge May 27, 2025 19:47
This commit adds a new metric, `subscription_send_queue_length`.
It is incremented by `eval_updates_sequential` before pushing into the new queue,
and decremented by the `send_worker` after popping.

Because the metric is labeled with `database_identity: Identity`,
the `SubscriptionManager` now needs to know its `Identity`, and so cannot be `Default`.
Methods are added to make constructing one in tests easier.
@gefjon gefjon enabled auto-merge May 27, 2025 20:23
gefjon added 2 commits May 28, 2025 10:07
Incl.:
- Renaming `ClientQueryUpdate` to `SingleQueryUpdate` (not requested, but a better name),
  and moving it into `websocket.rs` (requested).
- Restoring uncurrying of `TableUpdate::new` and `TableUpdate::push`,
  which shifts the partial abstraction around in a way I find irrelevant
  but Mazdak seems to appreciate.
  Specifically, the aggregator loop now doesn't have to destructure its `SingleQueryUpdate`s,
  but `ExecutionUnit::eval` and `collect_table_update` have to construct them needlessly.
- Consolidating test constructors for `ModuleSubscriptions`.
- Various comments.
- Separating the body of the `send_worker`'s loop into a separate function.
…clockworklabs/SpacetimeDB into phoebe/subscription-manager-send-worker
@gefjon gefjon requested a review from Centril May 28, 2025 14:12
Copy link
Contributor

@Centril Centril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

@gefjon gefjon added this pull request to the merge queue May 28, 2025
Merged via the queue into master with commit 1e50c7d May 28, 2025
20 checks passed
@Centril Centril deleted the phoebe/subscription-manager-send-worker branch May 28, 2025 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance A PR/Issue related to improving performance of stdb release-next

Projects

None yet

Development

Successfully merging this pull request may close these issues.

eval_updates should not wake up each client connection task

6 participants