Skip to content

Conversation

@coolreader18
Copy link
Collaborator

@coolreader18 coolreader18 commented Dec 3, 2025

Description of Changes

This reapplies the patch from #3704, and fixes the issues that were causing it to deadlock.

The reason it was deadlocking was that it allowed for the following sequence of events:

  • SchedulerActor::handle_queued() begins mutable tx
  • ModuleHost::disconnect_client() submits call to call_reducer(tx: None)
  • scheduler submits call to call_reducer(tx: Some)
  • WasmModuleInstance::disconnect_client now has to try to take tx lock, but the scheduler's call_reducer already holds it and is behind it in the queue

So, I moved most of the logic from handle_queued back to being executed in the module worker thread, but kept the code in scheduler.rs so that it can all be reasoned about locally.

Fixes #3645. Should I uncomment the implementation of ExportFunctionForScheduledTable for F: Procedure now?

Expected complexity level and risk

2 - there's a chance that this patch hasn't fully fixed the deadlock issue from #3704, but I'm quite confident.

Testing

  • Manually verified that deadlock no longer occurs - previously, while true; do python -m smoketests schedule_reducer -k test_scheduled_table_subscription; done would freeze up in only 2 or 3 iterations, but now it can run for 10 minutes without issues.

Copy link
Contributor

@Centril Centril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and I agree it should fix the deadlock. Just some small suggestions below.

@coolreader18 coolreader18 added this pull request to the merge queue Dec 5, 2025
Merged via the queue into master with commit afe169a Dec 5, 2025
64 of 68 checks passed
@coolreader18 coolreader18 deleted the noa/fix-scheduling branch December 8, 2025 18:41
Ludv1gL added a commit to Ludv1gL/SpacetimeDB that referenced this pull request Jan 1, 2026
This fixes a regression introduced in commit afe169a ("Fix the issues
with scheduling procedures clockworklabs#3816", Dec 5, 2025) where scheduled reducers
stopped recording their ReducerContext to the commitlog.

## Problem

When a scheduled reducer runs, the transaction was started with
`Workload::Internal`, and the code path that would patch it to
`Workload::Reducer(ReducerContext)` was removed during the refactor.

This meant that:
1. No `inputs` section was written to the commitlog for scheduled reducers
2. Temporal queries could not extract timestamps from scheduled reducer
   commits, returning 0 for all `__system_time__` values
3. The reducer's arguments and caller info were not persisted

## Root Cause

The old code in `module_host.rs::call_scheduled_reducer_inner` had:

```rust
tx.ctx = ExecutionContext::with_workload(
    tx.ctx.database_identity(),
    Workload::Reducer(ReducerContext { ... }),
);
```

This patching was removed when the logic was moved to `scheduler.rs`.
The new code in `call_reducer_with_tx` creates a ReducerContext but only
uses it when `tx` is `None` - since the scheduler passes `Some(tx)`,
the ReducerContext was never applied.

## Fix

Restore the `tx.ctx` patching in `scheduler.rs` before calling
`call_reducer_with_tx`, ensuring the ReducerContext is properly set
for scheduled reducers just as it was before the refactor.

## Testing

Verified with temporal-sensor-demo module:
- Before fix: `SELECT * FROM sensor_data ALL` returned `__system_time__: 0`
- After fix: Returns actual timestamps like `2025-12-21 01:42:51.113816`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ludv1gL added a commit to Ludv1gL/SpacetimeDB that referenced this pull request Jan 1, 2026
This fixes a regression introduced in commit afe169a ("Fix the issues
with scheduling procedures clockworklabs#3816", Dec 5, 2025) where scheduled reducers
stopped recording their ReducerContext to the commitlog.

## Problem

When a scheduled reducer runs, the transaction was started with
`Workload::Internal`, and the code path that would patch it to
`Workload::Reducer(ReducerContext)` was removed during the refactor.

This meant that:
1. No `inputs` section was written to the commitlog for scheduled reducers
2. The reducer's name, caller, timestamp, and arguments were not persisted

## Root Cause

The old code in `module_host.rs::call_scheduled_reducer_inner` had:

    tx.ctx = ExecutionContext::with_workload(
        tx.ctx.database_identity(),
        Workload::Reducer(ReducerContext { ... }),
    );

This patching was removed when the logic was moved to `scheduler.rs`.
The new code in `call_reducer_with_tx` creates a ReducerContext but only
uses it when `tx` is `None` - since the scheduler passes `Some(tx)`,
the ReducerContext was never applied.

## Fix

Restore the `tx.ctx` patching in `scheduler.rs` before calling
`call_reducer_with_tx`, ensuring the ReducerContext is properly set
for scheduled reducers just as it was before the refactor.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Procedures: enable executing scheduled procedures

3 participants