Fix the issues with scheduling procedures #3816
Merged
Conversation
Force-pushed from 8fe3d6d to a90fb3c
Force-pushed from a90fb3c to 31a1c73
Centril (Contributor) approved these changes on Dec 4, 2025
Looks good, and I agree it should fix the deadlock. Just some small suggestions below.
Ludv1gL added a commit to Ludv1gL/SpacetimeDB that referenced this pull request on Jan 1, 2026:
This fixes a regression introduced in commit afe169a ("Fix the issues with scheduling procedures clockworklabs#3816", Dec 5, 2025) where scheduled reducers stopped recording their ReducerContext to the commitlog.

## Problem

When a scheduled reducer runs, the transaction was started with `Workload::Internal`, and the code path that would patch it to `Workload::Reducer(ReducerContext)` was removed during the refactor. This meant that:

1. No `inputs` section was written to the commitlog for scheduled reducers
2. Temporal queries could not extract timestamps from scheduled reducer commits, returning 0 for all `__system_time__` values
3. The reducer's arguments and caller info were not persisted

## Root Cause

The old code in `module_host.rs::call_scheduled_reducer_inner` had:

```rust
tx.ctx = ExecutionContext::with_workload(
    tx.ctx.database_identity(),
    Workload::Reducer(ReducerContext { ... }),
);
```

This patching was removed when the logic was moved to `scheduler.rs`. The new code in `call_reducer_with_tx` creates a ReducerContext but only uses it when `tx` is `None` - since the scheduler passes `Some(tx)`, the ReducerContext was never applied.

## Fix

Restore the `tx.ctx` patching in `scheduler.rs` before calling `call_reducer_with_tx`, ensuring the ReducerContext is properly set for scheduled reducers just as it was before the refactor.

## Testing

Verified with temporal-sensor-demo module:

- Before fix: `SELECT * FROM sensor_data ALL` returned `__system_time__: 0`
- After fix: Returns actual timestamps like `2025-12-21 01:42:51.113816`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
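To make the regression and the fix concrete, here is a minimal, self-contained sketch of the control flow the commit message describes. Every name in it (`Tx`, `Workload`, `call_reducer_with_tx`, `call_scheduled`, the `String` standing in for `ReducerContext`) is an illustrative stand-in, not SpacetimeDB's real API:

```rust
// Stand-in for Workload::Internal / Workload::Reducer(ReducerContext).
enum Workload {
    Internal,
    Reducer(String), // String models the ReducerContext payload
}

struct Tx {
    ctx: Workload,
}

// Models the post-refactor call_reducer_with_tx: the reducer context
// is only attached on the `None` path, so a caller passing `Some(tx)`
// keeps whatever ctx the tx was started with.
fn call_reducer_with_tx(tx: Option<Tx>, reducer_ctx: String) -> Tx {
    match tx {
        None => Tx { ctx: Workload::Reducer(reducer_ctx) },
        Some(tx) => tx, // bug path: a scheduler tx keeps Workload::Internal
    }
}

// Models the fix: patch tx.ctx before handing the tx over. Without
// the patching line, the assertion in main() would fail.
fn call_scheduled(mut tx: Tx, reducer_ctx: String) -> Tx {
    tx.ctx = Workload::Reducer(reducer_ctx.clone());
    call_reducer_with_tx(Some(tx), reducer_ctx)
}

fn main() {
    let tx = Tx { ctx: Workload::Internal };
    let done = call_scheduled(tx, "send_message".to_owned());
    assert!(matches!(done.ctx, Workload::Reducer(_)));
}
```

The assertion passes only because `call_scheduled` patches `tx.ctx` before the call, mirroring the behavior the commit restores.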
Description of Changes
This reapplies the patch from #3704, and fixes the issues that were causing it to deadlock.
The reason it was deadlocking was that it allowed for the following sequence of events (modeled in the sketch below):

1. `SchedulerActor::handle_queued()` begins a mutable tx.
2. `ModuleHost::disconnect_client()` submits a call to `call_reducer(tx: None)`.
3. The scheduler submits its own call to `call_reducer(tx: Some)` behind it.
4. `WasmModuleInstance::disconnect_client` now has to try to take the tx lock, but the scheduler's `call_reducer` already holds it and is behind it in the queue.
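As a rough, self-contained model of why this ordering wedges the worker: jobs run strictly in FIFO order, so a job that blocks on the tx lock starves the later job that would release it. The `Job` enum and the boolean standing in for the tx lock are illustrative assumptions, not SpacetimeDB's real worker queue or lock types:

```rust
use std::collections::VecDeque;

enum Job {
    DisconnectClient, // must acquire the tx lock
    SchedulerReducer, // would release the tx lock when it runs
}

fn main() {
    // SchedulerActor::handle_queued() has already begun a mutable tx,
    // so the tx lock is held before either queued job runs.
    let mut tx_lock_held = true;

    // FIFO order on the module worker: disconnect_client was submitted
    // ahead of the scheduler's call_reducer(tx: Some).
    let queue = VecDeque::from([Job::DisconnectClient, Job::SchedulerReducer]);

    for job in queue {
        match job {
            Job::DisconnectClient => {
                if tx_lock_held {
                    // The worker blocks here; the only job that could
                    // release the lock is behind us in the queue.
                    println!("deadlock: lock held, releaser queued behind us");
                    return;
                }
            }
            Job::SchedulerReducer => tx_lock_held = false,
        }
    }
}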
So, I moved most of the logic from `handle_queued` back to being executed in the module worker thread, but kept the code in `scheduler.rs` so that it can all be reasoned about locally.
Fixes #3645. Should I uncomment the implementation of `ExportFunctionForScheduledTable for F: Procedure` now?
Expected complexity level and risk
2 - there's a chance that this patch hasn't fully fixed the deadlock issue from #3704, but I'm quite confident.
Testing
`while true; do python -m smoketests schedule_reducer -k test_scheduled_table_subscription; done` would freeze up in only 2 or 3 iterations, but now it can run for 10 minutes without issues.