Task Execution Workflow and State Management #10

@starbops

User Story

As a user, I want reliable task execution with proper state tracking so that I can monitor my code execution progress and results.

Technical Requirements

  • Implement task state machine (pending → running → completed/failed, plus cancelled and timeout states)
  • Create execution queue with Redis
  • Build container lifecycle management
  • Implement timeout handling and graceful termination
  • Add execution result collection and storage
  • Create retry mechanisms for failed executions

Acceptance Criteria

  • Task state transitions follow defined state machine
  • Queue processes tasks in priority order
  • Timeout handling terminates long-running tasks
  • Execution results captured and stored correctly
  • Failed tasks can be retried with exponential backoff
  • Concurrent executions limited per user
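
The per-user concurrency criterion could be met with a simple counter per user. The following is a minimal in-process sketch, not the project's implementation: `UserLimiter`, `TryAcquire`, and `Release` are hypothetical names, and with several workers sharing one Redis queue the counter would have to live in Redis rather than in process memory.

```go
package main

import (
	"fmt"
	"sync"
)

// UserLimiter (hypothetical) caps how many tasks each user may run at once.
type UserLimiter struct {
	mu      sync.Mutex
	max     int
	running map[string]int
}

// NewUserLimiter returns a limiter allowing max concurrent tasks per user.
func NewUserLimiter(max int) *UserLimiter {
	return &UserLimiter{max: max, running: make(map[string]int)}
}

// TryAcquire reserves a slot for userID and reports whether one was free.
func (l *UserLimiter) TryAcquire(userID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.running[userID] >= l.max {
		return false
	}
	l.running[userID]++
	return true
}

// Release frees a slot previously reserved with TryAcquire.
func (l *UserLimiter) Release(userID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.running[userID] > 0 {
		l.running[userID]--
	}
	if l.running[userID] == 0 {
		delete(l.running, userID)
	}
}

func main() {
	limiter := NewUserLimiter(2)
	fmt.Println(limiter.TryAcquire("alice")) // true
	fmt.Println(limiter.TryAcquire("alice")) // true
	fmt.Println(limiter.TryAcquire("alice")) // false: limit reached
	limiter.Release("alice")
	fmt.Println(limiter.TryAcquire("alice")) // true again
}
```

A worker would call `TryAcquire` before moving a task to running and `Release` once the task reaches any final state; a task that cannot acquire a slot stays pending.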

Definition of Done

  • State machine implemented with proper validation
  • Queue processing functional with Redis
  • Timeout and cancellation working correctly
  • Execution results properly collected and stored
  • Retry logic tested under failure conditions
  • Performance tested with 100+ concurrent executions

Implementation Guide

Redis Queue Setup

# Install Redis client
go get github.com/go-redis/redis/v8

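# Redis configuration for task queues
# noeviction: under memory pressure Redis rejects writes instead of silently evicting queued tasks
# (allkeys-lru would allow queue keys to be evicted, which loses tasks)
redis-cli CONFIG SET maxmemory-policy noeviction
# Snapshot to disk every 60 seconds if at least 1000 keys changed
redis-cli CONFIG SET save "60 1000"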

Task State Machine

type TaskStatus string

const (
    StatusPending   TaskStatus = "pending"
    StatusRunning   TaskStatus = "running"
    StatusCompleted TaskStatus = "completed"
    StatusFailed    TaskStatus = "failed"
    StatusCancelled TaskStatus = "cancelled"
    StatusTimeout   TaskStatus = "timeout"
)

// Valid state transitions
var validTransitions = map[TaskStatus][]TaskStatus{
    StatusPending:   {StatusRunning, StatusCancelled},
    StatusRunning:   {StatusCompleted, StatusFailed, StatusTimeout, StatusCancelled},
    StatusCompleted: {},
    StatusFailed:    {StatusPending}, // Allow retry
    StatusTimeout:   {StatusPending}, // Allow retry
    StatusCancelled: {},
}
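
The transition table above only becomes useful once something checks it. A minimal sketch of that validation follows; `CanTransition` and `Transition` are hypothetical helper names, and the table is repeated so the example compiles on its own.

```go
package main

import "fmt"

type TaskStatus string

const (
	StatusPending   TaskStatus = "pending"
	StatusRunning   TaskStatus = "running"
	StatusCompleted TaskStatus = "completed"
	StatusFailed    TaskStatus = "failed"
	StatusCancelled TaskStatus = "cancelled"
	StatusTimeout   TaskStatus = "timeout"
)

// Same table as in the issue, repeated to keep the sketch self-contained.
var validTransitions = map[TaskStatus][]TaskStatus{
	StatusPending:   {StatusRunning, StatusCancelled},
	StatusRunning:   {StatusCompleted, StatusFailed, StatusTimeout, StatusCancelled},
	StatusCompleted: {},
	StatusFailed:    {StatusPending},
	StatusTimeout:   {StatusPending},
	StatusCancelled: {},
}

// CanTransition reports whether the table allows moving from one status to another.
func CanTransition(from, to TaskStatus) bool {
	for _, next := range validTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

// Transition returns the new status, or the unchanged status and an error
// when the move is not allowed.
func Transition(from, to TaskStatus) (TaskStatus, error) {
	if !CanTransition(from, to) {
		return from, fmt.Errorf("invalid transition: %s -> %s", from, to)
	}
	return to, nil
}

func main() {
	fmt.Println(CanTransition(StatusPending, StatusRunning))   // true
	fmt.Println(CanTransition(StatusCompleted, StatusRunning)) // false
	_, err := Transition(StatusCancelled, StatusPending)
	fmt.Println(err) // invalid transition: cancelled -> pending
}
```

Routing every status update through one function like `Transition` keeps the rule in a single place; in a real store the check and the write would also need to happen atomically.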

Container Lifecycle Management

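import (
    "context"
    "fmt"
    "time"
)

// ExecutionContext tracks a single in-flight task execution.
type ExecutionContext struct {
    TaskID      string
    ContainerID string
    StartTime   time.Time
    Timeout     time.Duration
    CancelChan  chan struct{}
    ResultChan  chan ExecutionResult
}

// ExecuteTask creates a container for the task and runs it under the task's timeout.
func (e *ExecutionEngine) ExecuteTask(ctx context.Context, task *Task) error {
    // Create container
    container, err := e.createContainer(task)
    if err != nil {
        return fmt.Errorf("failed to create container: %w", err)
    }

    // Start execution with timeout
    execCtx := &ExecutionContext{
        TaskID:      task.ID,
        ContainerID: container.ID, // assumes the value returned by createContainer exposes an ID field
        StartTime:   time.Now(),
        Timeout:     time.Duration(task.TimeoutSeconds) * time.Second,
        CancelChan:  make(chan struct{}),
        ResultChan:  make(chan ExecutionResult, 1), // buffered so the sender never blocks
    }

    return e.runWithTimeout(ctx, execCtx, container)
}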

Queue Processing

  • Priority-based task scheduling
  • Worker pool with configurable concurrency
  • Exponential backoff for retry logic
  • Dead letter queue for permanently failed tasks
  • Health checks for queue workers

Execution Results Collection

  • Capture stdout/stderr streams
  • Record execution time and resource usage
  • Store exit codes and error messages
  • Collect artifacts and logs
  • Update task status in database

Related Epic

Contributes to Epic #8: Container Execution Engine

Status

Done
