[Bug] Ralph Loop tick() uses Promise.all — partial assignTasks failure corrupts task queue state #24

@noahwaldner

Severity: High
Category: Logic Bug
File: src/ralph-loop.ts

Description

tick() runs checkTimeouts() and assignTasks() concurrently via Promise.all. assignTasks() iterates over idle sessions and assigns tasks one at a time. Each iteration mutates task state (task.assign(), session.assignTask()) before awaiting the async sendInput(). If assignTasks() throws partway through the loop, Promise.all rejects immediately without waiting for checkTimeouts() to settle. The mutations from earlier iterations, and from the failing iteration up to the point of the error, are left in place with no rollback.
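The rejection behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `slowCheck` and `failingAssign` are hypothetical stand-ins for checkTimeouts() and assignTasks(), and the delays are arbitrary.

```typescript
// Hypothetical stand-ins; none of these names exist in src/ralph-loop.ts.
const events: string[] = [];

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Plays the role of checkTimeouts(): slow, but succeeds.
async function slowCheck(): Promise<void> {
  await sleep(20);
  events.push("check finished");
}

// Plays the role of assignTasks(): mutates state, then fails later.
async function failingAssign(): Promise<void> {
  events.push("iteration 1 mutated state");
  await sleep(5);
  throw new Error("sendInput failed on iteration 2");
}

async function demo(): Promise<string[]> {
  try {
    await Promise.all([slowCheck(), failingAssign()]);
  } catch {
    // Promise.all rejects as soon as one input rejects.
    events.push("tick rejected");
  }
  // Give the still-running slowCheck() time to finish.
  await sleep(40);
  return events;
}
```

In this sketch the caller sees the rejection while `slowCheck` is still in flight, and the mutation recorded by the first iteration is never undone.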

Code

src/ralph-loop.ts:281–312:

private async tick(): Promise<void> {
  await Promise.all([
    this.checkTimeouts(),
    this.assignTasks(),   // ← if this throws mid-loop, earlier mutations are not rolled back
  ]);
  ...
}

private async assignTasks(): Promise<void> {
  const idleSessions = this.sessionManager.getIdleSessions();
  for (const session of idleSessions) {
    const task = this.taskQueue.next();  // ← task dequeued
    if (!task) break;
    await this.assignTaskToSession(task, session);
    // ↑ task.assign() + session.assignTask() happen inside here
    // if this throws on iteration 2, iteration 1's state mutations stand
  }
}
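One possible mitigation is to catch failures per iteration and undo only that iteration's mutations. The sketch below assumes simplified, hypothetical `Task` and `Session` shapes and a plain array as the queue; the real classes, the priority queue, and assignTaskToSession() are not shown in this issue and may differ.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
type TaskStatus = "pending" | "in_progress";

interface Task {
  id: string;
  status: TaskStatus;
  sessionId?: string;
}

interface Session {
  id: string;
  busy: boolean;
  send(prompt: string): Promise<void>; // stands in for sendInput()
}

async function assignWithRollback(
  queue: Task[],
  idleSessions: Session[],
): Promise<Error[]> {
  const errors: Error[] = [];
  for (const session of idleSessions) {
    const task = queue.shift();
    if (!task) break;

    // Mutations that mirror task.assign() and session.assignTask().
    task.status = "in_progress";
    task.sessionId = session.id;
    session.busy = true;

    try {
      await session.send(`run ${task.id}`);
    } catch (err) {
      // Undo this iteration only, then put the task back at the front.
      task.status = "pending";
      delete task.sessionId;
      session.busy = false;
      queue.unshift(task);
      errors.push(err instanceof Error ? err : new Error(String(err)));
    }
  }
  return errors;
}
```

Returning the collected errors instead of throwing lets the caller decide whether a partial failure should abort the tick. Note that a requeued task is offered to the next idle session in the same pass, which may or may not be desirable if the failure is specific to the task.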

Concurrency Concern

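checkTimeouts() and assignTasks() are in flight at the same time. JavaScript runs them on a single thread, so they never execute in parallel, but they can interleave at every await. If a task times out while it is being assigned, both functions can act on it in the same tick: checkTimeouts() may call task.fail() while assignTasks() is suspended partway through assignTaskToSession(), for example after task.assign() but before sendInput() resolves. The resulting task state depends on the interleaving order and is not well defined.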
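The simplest way to rule out this interleaving is to run the two phases one after the other. The sketch below is an assumption about how tick() could be restructured, not the current implementation; `tickSequential` and the logging callbacks are hypothetical.

```typescript
// Hypothetical helper; the real tick() is a private method on the loop class.
const pause = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function tickSequential(
  checkTimeouts: () => Promise<void>,
  assignTasks: () => Promise<void>,
): Promise<void> {
  // Timeouts are fully processed before any assignment starts,
  // so task.fail() cannot land in the middle of an assignment.
  await checkTimeouts();
  await assignTasks();
}

// Usage: the log shows that the two phases never overlap.
const phaseLog: string[] = [];

async function runSequentialDemo(): Promise<string[]> {
  await tickSequential(
    async () => {
      phaseLog.push("timeouts:start");
      await pause(5);
      phaseLog.push("timeouts:end");
    },
    async () => {
      phaseLog.push("assign:start");
      await pause(1);
      phaseLog.push("assign:end");
    },
  );
  return phaseLog;
}
```

The trade-off is that a tick now takes the sum of both phases rather than the longer of the two. A rejection from checkTimeouts() also skips assignment for that tick.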

Impact

After a partial failure:

  1. Some tasks are in_progress with sessions assigned — correct.
  2. Other tasks may be dequeued from the priority queue but never assigned — they are silently dropped.
  3. The session from the failing iteration may remain busy with no prompt sent (if the failure occurred after session.assignTask() but before sendInput() completed).
  4. The task queue has no self-healing mechanism for this state — manual intervention or a full restart is required.

This is especially dangerous during high-frequency autonomous runs where many tasks are being assigned across many sessions simultaneously.
