Skip to content

feat: add universal hooks support#45

Open
NathanFlurry wants to merge 1 commit intomainfrom
feat/hooks
Open

feat: add universal hooks support#45
NathanFlurry wants to merge 1 commit intomainfrom
feat/hooks

Conversation

@NathanFlurry
Copy link
Member

Summary

Adds universal hooks support to sandbox-agent. Hooks are managed by sandbox-agent itself (not agent-native hooks) and work across all supported agents.

Hook Types

  • on_session_start — After session creation
  • on_session_end — When session terminates
  • on_message_start — Before message is processed
  • on_message_end — After message/response completes

API Changes

Extended CreateSessionRequest with:

  • hooks: Option<HooksConfig>
  • working_dir: Option<String>

HookDefinition

struct HookDefinition {
    command: String,
    timeout: Option<u64>,
    working_dir: Option<String>,
    continue_on_failure: Option<bool>,
}

Environment Variables

Hooks receive:

  • SANDBOX_SESSION_ID
  • SANDBOX_AGENT
  • SANDBOX_AGENT_MODE
  • SANDBOX_HOOK_TYPE
  • SANDBOX_MESSAGE (for message hooks)

Testing

  • 8 unit tests in hooks.rs
  • 9 integration tests with mock agent
  • Snapshot testing for all lifecycle hooks

Implements sandbox-agent-managed hooks that work universally across all agents
(Claude, Codex, OpenCode, Amp, Mock). Hooks are shell commands executed at
specific lifecycle points in a session.

## Hook Types
- on_session_start - Executed when a session is created
- on_session_end - Executed when a session terminates
- on_message_start - Executed before processing each message
- on_message_end - Executed after each message is fully processed

## Features
- Hooks are configured via the API when creating sessions
- Environment variables provide context (SANDBOX_SESSION_ID, SANDBOX_AGENT, etc.)
- Configurable timeout per hook
- continue_on_failure option to control execution flow
- Working directory support for hook execution

## API Changes
- CreateSessionRequest now accepts optional 'hooks' and 'workingDir' fields
- HooksConfig and HookDefinition schemas added to OpenAPI spec

## Testing
- 8 unit tests for hook execution
- 9 integration tests using mock agent with snapshot testing
- Tests cover all lifecycle hooks, multiple hooks, failure handling, and env vars
@railway-app
Copy link

railway-app bot commented Jan 30, 2026

🚅 Deployed to the sandbox-agent-pr-45 environment in sandbox-agent

Service Status Web Updated (UTC)
inspector 😴 Sleeping (View Logs) Web Jan 30, 2026 at 9:49 pm
website 😴 Sleeping (View Logs) Web Jan 30, 2026 at 9:49 pm

@claude
Copy link

claude bot commented Jan 30, 2026

Code Review

Found 5 issues that need attention:

CLAUDE.md Violations

1. TypeScript SDK not updated

The HTTP API was changed to add hooks (HooksConfig) and working_dir (String) fields to CreateSessionRequest:

/// Hooks configuration for lifecycle events.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub hooks: Option<HooksConfig>,
/// Working directory for hook execution.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub working_dir: Option<String>,
}

However, the TypeScript SDK was not updated in this PR. According to CLAUDE.md:

When changing the HTTP API, update the TypeScript SDK and CLI together.


2. CLI not updated to expose hooks and working_dir parameters

The HTTP API now accepts hooks and working_dir parameters in CreateSessionRequest, but the CLI command sandbox-agent api sessions create does not expose flags to set these values:

#[derive(Args, Debug)]
struct CreateSessionArgs {
session_id: String,
#[arg(long, short = 'a')]
agent: String,
#[arg(long, short = 'g')]
agent_mode: Option<String>,
#[arg(long, short = 'p')]
permission_mode: Option<String>,
#[arg(long, short = 'm')]
model: Option<String>,
#[arg(long, short = 'v')]
variant: Option<String>,
#[arg(long, short = 'A')]
agent_version: Option<String>,
#[command(flatten)]
client: ClientArgs,
}

These are currently hardcoded to None:

agent_version: args.agent_version.clone(),
hooks: None,
working_dir: None,
};

According to CLAUDE.md:

When changing the HTTP API, update the TypeScript SDK and CLI together.


3. docs/cli.mdx not updated

The CLI command structure was modified to include new hooks and working_dir fields in the request, but the documentation was not updated.

According to CLAUDE.md:

When adding or modifying CLI commands, update docs/cli.mdx to reflect the changes.


Bugs

4. Potential deadlock in stdout/stderr reading

In server/packages/sandbox-agent/src/hooks.rs:237-250, the code reads stdout and stderr sequentially in a blocking manner:

if let Some(ref mut handle) = child.stdout {
let mut buf = vec![0u8; MAX_OUTPUT_SIZE];
let n = handle.read(&mut buf).unwrap_or(0);
stdout = String::from_utf8_lossy(&buf[..n]).to_string();
}
if let Some(ref mut handle) = child.stderr {
let mut buf = vec![0u8; MAX_OUTPUT_SIZE];
let n = handle.read(&mut buf).unwrap_or(0);
stderr = String::from_utf8_lossy(&buf[..n]).to_string();
}
let status = child.wait()?;
Ok::<_, std::io::Error>((status, stdout, stderr))

This creates a classic pipe deadlock scenario:

  1. Read::read() on a pipe blocks until at least one byte is available or EOF
  2. If a child process writes extensively to stderr (filling the OS pipe buffer, typically ~64KB) before writing anything to stdout, the child blocks waiting for the parent to drain stderr
  3. Meanwhile, the parent blocks on stdout.read() waiting for data that never comes
  4. Both processes are deadlocked

Suggested fix: Use Command::output() or wait_with_output() which handles concurrent reading internally, or read both streams concurrently using threads/async.


5. Child process not killed on timeout (resource leak)

In server/packages/sandbox-agent/src/hooks.rs:320-337, when the timeout fires, the code only returns a timeout result but does not kill the child process:

}
}
Err(_) => {
let duration_ms = start.elapsed().as_millis() as u64;
warn!(
command = %command_for_result,
timeout_secs = %timeout_duration.as_secs(),
"Hook timed out"
);
HookResult {
command: command_for_result,
success: false,
exit_code: None,
stdout: String::new(),
stderr: format!("Hook timed out after {} seconds", timeout_duration.as_secs()),
duration_ms,
timed_out: true,
}
}
}

The problems:

  1. tokio::time::timeout() only drops the future when it fires, but spawn_blocking tasks cannot be cancelled and continue running to completion
  2. The Child process remains alive inside the blocking task
  3. In Rust, dropping a std::process::Child does NOT kill the child process - you must explicitly call child.kill()
  4. There is no cleanup code to kill the child on timeout

Result: A command like sleep 10 with a 1-second timeout will leave the sleep process running for the full 10 seconds as an orphan, causing a resource leak.

Suggested fix: Store the child process in a way that allows cleanup, or use a different approach that can properly terminate timed-out processes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant