14 changes: 14 additions & 0 deletions .vscode/tasks.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Run status panel (banner test)",
      "type": "shell",
      "command": "cargo run -- --config config.json --daemon",
      "args": [],
      "isBackground": false,
      "problemMatcher": [],
      "group": "build"
    }
  ]
}
61 changes: 61 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,66 @@
# Changelog

## 2026-02-02
### Added - Container Exec & Server Resources Commands

#### New Stacker Commands (`commands/stacker.rs`)
- `ExecCommand` / `stacker.exec`: Execute commands inside running containers
- Parameters: deployment_hash, app_code, command, timeout (1-120s)
- **Security**: Blocks dangerous commands (rm -rf /, mkfs, dd if, shutdown, reboot, poweroff, halt, init 0/6, fork bombs)
- Case-insensitive pattern matching for security blocks
- Returns exit_code, stdout, stderr (output redacted for secrets)
- Comprehensive test suite with 27 security tests

- `ServerResourcesCommand` / `stacker.server_resources`: Collect server metrics
- Parameters: deployment_hash, include_disk, include_network, include_processes
- Uses MetricsCollector for CPU, memory, disk, network, and process info
- Returns structured JSON with system resource data

- `ListContainersCommand` / `stacker.list_containers`: List deployment containers
- Parameters: deployment_hash, include_health, include_logs, log_lines (1-1000)
Copilot AI Feb 2, 2026

The ListContainersCommand entry here documents log_lines as accepting a range of 1–1000, but the implementation in commands/stacker.rs clamps log_lines to 1–100 (self.log_lines = self.log_lines.clamp(1, 100)). To avoid confusion for API consumers, either update the documented range to 1–100 or relax the clamp in the code to match the stated 1–1000 range.

Suggested change
- Parameters: deployment_hash, include_health, include_logs, log_lines (1-1000)
- Parameters: deployment_hash, include_health, include_logs, log_lines (1-100)

- Returns container list with status, health info, and optional recent logs
Comment on lines +14 to +21
Copilot AI Feb 2, 2026

The description for ServerResourcesCommand lists an include_processes parameter and claims network and process info are returned, but the actual struct in commands/stacker.rs only has include_disk and include_network, and handle_server_resources currently exposes only CPU/memory/disk (with a placeholder network note and no process data). Please update this section either to match the implemented fields/metrics or extend the implementation to provide the documented behavior.


#### Docker Module Updates (`agent/docker.rs`)
- Added `exec_in_container_with_output()`: Execute commands and capture stdout/stderr separately
- Creates exec instance, starts with output capture
- Waits for completion and inspects exit code
- Returns structured (exit_code, stdout, stderr) tuple

#### Test Coverage
- `exec_command_security_tests`: 27 tests covering blocked commands, validation, timeout clamping
- `server_resources_command_tests`: 3 tests for parsing and validation
- `list_containers_command_tests`: 3 tests for parsing and log_lines clamping
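
As a rough illustration of what the `exec_command_security_tests` cover, the sketch below shows one way a case-insensitive block list and a timeout clamp could be written. The pattern list is copied from the changelog entry above; the names `BLOCKED_PATTERNS`, `is_blocked`, and `clamp_timeout`, the fork-bomb pattern `:(){`, and the substring-matching strategy are all assumptions made for illustration, not the code in `commands/stacker.rs`.

```rust
/// Hypothetical block list based on the changelog entry; the real list and
/// matching rules in commands/stacker.rs may differ.
const BLOCKED_PATTERNS: &[&str] = &[
    "rm -rf /", "mkfs", "dd if", "shutdown", "reboot", "poweroff", "halt",
    "init 0", "init 6", ":(){",
];

/// Returns true when the command contains any blocked pattern.
/// The command is lowercased first, so matching is case-insensitive.
fn is_blocked(command: &str) -> bool {
    let lowered = command.to_lowercase();
    BLOCKED_PATTERNS.iter().any(|pattern| lowered.contains(pattern))
}

/// Clamps a requested timeout into the documented 1-120 second range.
fn clamp_timeout(seconds: u64) -> u64 {
    seconds.clamp(1, 120)
}
```

Plain substring matching is deliberately coarse: in this sketch `rm -rf /tmp/cache` is also rejected because it contains `rm -rf /`. Whether the real implementation is that strict is not stated in the changelog.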

## 2026-01-29
### Added - Unified Configuration Management Commands

#### New Stacker Commands (`commands/stacker.rs`)
- `FetchAllConfigs` / `stacker.fetch_all_configs`: Bulk fetch all app configs from Vault
- Parameters: deployment_hash, app_codes (optional - fetch all if empty), apply, archive
- Lists all available configs via Vault LIST operation
- Optionally writes all configs to disk
- Optionally creates tar.gz archive of all configs
- Returns detailed summary with fetched/applied counts

- `DeployWithConfigs` / `stacker.deploy_with_configs`: Unified config+deploy operation
- Parameters: deployment_hash, app_code, pull, force_recreate, apply_configs
- Fetches docker-compose.yml from Vault (_compose key) and app-specific .env
- Writes configs to disk before deployment
- Delegates to existing deploy_app handler for container orchestration
- Combines config and deploy results in single response

- `ConfigDiff` / `stacker.config_diff`: Detect configuration drift
- Parameters: deployment_hash, app_codes (optional), include_diff
- Compares SHA256 hashes of Vault configs vs deployed files
- Reports status: synced, drifted, or missing for each app
- Optionally includes line counts and content previews for drifted configs
- Summary with total/synced/drifted/missing counts
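
The drift classification described for `ConfigDiff` can be sketched as follows, assuming the SHA256 digests have already been computed and hex-encoded elsewhere. The type and function names (`DriftStatus`, `classify`, `summarize`) are hypothetical, and treating an absent deployed file as `Missing` is an assumption about what "missing" means in the changelog.

```rust
/// Per-app result of comparing the Vault config with the deployed file.
#[derive(Debug, PartialEq)]
enum DriftStatus {
    Synced,
    Drifted,
    Missing,
}

/// Compares two hex-encoded digests. `deployed_hash` is `None` when the
/// deployed file does not exist (assumed meaning of "missing").
fn classify(vault_hash: &str, deployed_hash: Option<&str>) -> DriftStatus {
    match deployed_hash {
        None => DriftStatus::Missing,
        // Hex digests may differ in letter case, so compare case-insensitively.
        Some(hash) if hash.eq_ignore_ascii_case(vault_hash) => DriftStatus::Synced,
        Some(_) => DriftStatus::Drifted,
    }
}

/// Aggregate counts, mirroring the total/synced/drifted/missing summary.
#[derive(Debug, Default, PartialEq)]
struct Summary {
    total: usize,
    synced: usize,
    drifted: usize,
    missing: usize,
}

/// Counts each status once and records the overall total.
fn summarize(statuses: &[DriftStatus]) -> Summary {
    let mut summary = Summary { total: statuses.len(), ..Default::default() };
    for status in statuses {
        match status {
            DriftStatus::Synced => summary.synced += 1,
            DriftStatus::Drifted => summary.drifted += 1,
            DriftStatus::Missing => summary.missing += 1,
        }
    }
    summary
}
```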

#### Command Infrastructure
- Added normalize/validate/with_command_context for all new commands
- Integrated all new commands into execute_with_docker dispatch
- Added test cases for command parsing

## 2026-01-23
### Added - Vault Configuration Management

3 changes: 3 additions & 0 deletions docker-compose-dev.yml
@@ -10,6 +10,9 @@ services:
- .:/app
- /var/run/docker.sock:/var/run/docker.sock
- /data/encrypted:/data/encrypted
# Mount docker CLI from host for deploy_app/remove_app commands
- /usr/bin/docker:/usr/bin/docker:ro
- /usr/libexec/docker/cli-plugins:/usr/libexec/docker/cli-plugins:ro
env_file:
- .env
environment:
8 changes: 8 additions & 0 deletions docker-compose.yml
@@ -27,6 +27,11 @@ services:
volumes:
# Agent needs Docker socket for container monitoring and logs
- /var/run/docker.sock:/var/run/docker.sock
# Mount docker CLI from host for deploy_app/remove_app commands
- /usr/bin/docker:/usr/bin/docker:ro
- /usr/libexec/docker/cli-plugins:/usr/libexec/docker/cli-plugins:ro
# Mount host path for compose and config files
- /home/trydirect:/home/trydirect
env_file:
- .env
environment:
@@ -46,6 +51,9 @@ services:
volumes:
# Compose agent has exclusive Docker socket access for operations
- /var/run/docker.sock:/var/run/docker.sock
# Mount docker CLI from host for compose commands
- /usr/bin/docker:/usr/bin/docker:ro
- /usr/libexec/docker/cli-plugins:/usr/libexec/docker/cli-plugins:ro
- .:/app
env_file:
- .env
144 changes: 144 additions & 0 deletions docs/APP_DEPLOYMENT.md
@@ -0,0 +1,144 @@
## Plan: App Configuration Deployment Strategy (Status Panel)

This plan outlines a robust, flexible approach for deploying app configurations using Status Panel, Vault, and external sources. It covers template sourcing, environment variable management, network integration, and extensibility for future needs.

---

## Vault Token Security Strategy (Selected Approach)

### Decision: Per-Deployment Scoped Tokens

Each Status Panel agent receives its own Vault token, scoped to only access that deployment's secrets. This provides:

| Security Property | How It's Achieved |
|-------------------|-------------------|
| **Tenant Isolation** | Each deployment has isolated Vault path: `{prefix}/{deployment_hash}/*` |
| **Blast Radius Limitation** | Compromised agent can only access its own deployment's secrets |
| **Revocation Granularity** | Individual deployments can be revoked without affecting others |
| **Audit Trail** | All Vault accesses are logged per-deployment for forensics |
| **Compliance** | Meets SOC2/ISO 27001 requirements for secret isolation |

### Vault Path Structure

```text
{VAULT_AGENT_PATH_PREFIX}/
└── {deployment_hash}/
    ├── status_panel_token       # Agent authentication token (TTL: 30 days)
    ├── compose_agent_token      # Docker Compose agent token
    └── apps/
        └── {app_code}/
            ├── _compose         # docker-compose.yml (key: {app_code})
            ├── _env             # .env file (key: {app_code}_env)
            ├── _configs         # Bundled config files JSON array (key: {app_code}_configs)
            └── _config          # Legacy single config (key: {app_code}_config)
```

**Key Suffix Mapping** (used by VaultService):

| Suffix Pattern | Vault Key | Purpose |
|----------------|-----------|---------|
| `{app_code}` | `_compose` | Docker compose file |
| `{app_code}_env` | `_env` | Environment file (.env) |
| `{app_code}_configs` | `_configs` | Bundled config files (JSON array) |
| `{app_code}_config` | `_config` | Legacy single config file |
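
The mapping in the table above can be expressed as a small pair of helpers. This is only a sketch: `ConfigKind`, `vault_key`, and `parse_key` are hypothetical names, and the actual VaultService may build or parse keys differently.

```rust
/// The four kinds of per-app entries listed in the table above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConfigKind {
    Compose,
    Env,
    Configs,
    LegacyConfig,
}

/// Builds the key for an app: `{app_code}`, `{app_code}_env`,
/// `{app_code}_configs`, or `{app_code}_config`.
fn vault_key(app_code: &str, kind: ConfigKind) -> String {
    match kind {
        ConfigKind::Compose => app_code.to_string(),
        ConfigKind::Env => format!("{app_code}_env"),
        ConfigKind::Configs => format!("{app_code}_configs"),
        ConfigKind::LegacyConfig => format!("{app_code}_config"),
    }
}

/// Splits a key back into (app_code, kind). A key with no known suffix is
/// treated as the compose entry.
fn parse_key(key: &str) -> (&str, ConfigKind) {
    if let Some(app) = key.strip_suffix("_configs") {
        (app, ConfigKind::Configs)
    } else if let Some(app) = key.strip_suffix("_config") {
        (app, ConfigKind::LegacyConfig)
    } else if let Some(app) = key.strip_suffix("_env") {
        (app, ConfigKind::Env)
    } else {
        (key, ConfigKind::Compose)
    }
}
```

One consequence of suffix-based keys is that an app code which itself ends in `_env` or `_config` would be misread by `parse_key`; a scheme like this only round-trips if app codes avoid those suffixes.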

**Config Bundle Format** (`_configs` key):
```json
[
  {
    "name": "telegraf.conf",
    "content": "[[inputs.cpu]]\n...",
    "content_type": "text/plain",
    "destination_path": "/etc/telegraf/telegraf.conf",
    "file_mode": "0644",
    "owner": "telegraf",
    "group": "telegraf"
  }
]
```
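
Because `file_mode` is stored as an octal string, a consumer has to convert it before applying permissions. The helper below is a minimal sketch of that conversion; `parse_file_mode` is a hypothetical name and the range check is an assumption, since the document does not say how the agent validates this field.

```rust
/// Parses an octal permission string such as "0644" into mode bits.
/// Rejects non-octal input and values above 0o7777.
fn parse_file_mode(mode: &str) -> Result<u32, String> {
    let bits = u32::from_str_radix(mode, 8)
        .map_err(|e| format!("invalid file_mode {mode:?}: {e}"))?;
    if bits > 0o7777 {
        return Err(format!("file_mode {mode:?} is out of range"));
    }
    Ok(bits)
}
```

On Unix the resulting bits could then be handed to `std::fs::Permissions::from_mode` (from `std::os::unix::fs::PermissionsExt`) when writing the file to `destination_path`.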

### Token Lifecycle

1. **Provisioning** (Install Service):
- During deployment, Install Service creates a new Vault token
- Token policy restricts access to `{prefix}/{deployment_hash}/*` only
- Token stored in Vault at `{prefix}/{deployment_hash}/status_panel_token`
- Token injected into Status Panel agent via environment variable

2. **Runtime** (Status Panel Agent):
- Agent reads `VAULT_TOKEN` from environment on startup
- All Vault API calls use this scoped token
- Token TTL: 30 days with auto-renewal capability

3. **Revocation** (On Deployment Destroy):
- Install Service deletes the deployment's Vault path recursively
- Token becomes invalid immediately
- All secrets for that deployment are removed

### Vault Policy Template

```hcl
# Policy: status-panel-{deployment_hash}
# Created by Install Service during deployment provisioning

path "{prefix}/{deployment_hash}/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

# Deny access to other deployments (implicit, but explicit for clarity)
path "{prefix}/*" {
  capabilities = ["deny"]
}
```
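
To show how the template above might be filled in per deployment, here is a hedged sketch that renders the policy text from a prefix and a deployment hash. The function names and the character whitelist are assumptions; the Install Service's real provisioning code is not part of this change.

```rust
/// Accepts only characters that cannot break out of the quoted HCL path.
/// (Assumed restriction; a real prefix may legitimately contain `/`.)
fn is_safe_segment(s: &str) -> bool {
    !s.is_empty()
        && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Policy name following the `status-panel-{deployment_hash}` convention.
fn policy_name(deployment_hash: &str) -> String {
    format!("status-panel-{deployment_hash}")
}

/// Renders the HCL policy body for one deployment.
fn render_policy(prefix: &str, deployment_hash: &str) -> Result<String, String> {
    if !is_safe_segment(prefix) || !is_safe_segment(deployment_hash) {
        return Err("prefix and deployment_hash must match [A-Za-z0-9_-]+".to_string());
    }
    let mut policy = String::new();
    policy.push_str(&format!("path \"{prefix}/{deployment_hash}/*\" {{\n"));
    policy.push_str("  capabilities = [\"create\", \"read\", \"update\", \"delete\", \"list\"]\n}\n\n");
    policy.push_str(&format!("path \"{prefix}/*\" {{\n"));
    policy.push_str("  capabilities = [\"deny\"]\n}\n");
    Ok(policy)
}
```

Validating the inputs before interpolation matters here because the hash ends up inside a quoted path; an unchecked value containing `"` or `}` could otherwise alter the policy.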

### Environment Variables (Status Panel Agent)

| Variable | Description | Example |
|----------|-------------|---------|
| `VAULT_ADDRESS` | Vault server URL | `https://vault.trydirect.io:8200` |
| `VAULT_TOKEN` | Per-deployment scoped token | (provisioned by Install Service) |
| `VAULT_AGENT_PATH_PREFIX` | KV mount/prefix | `status_panel` |
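
A minimal sketch of reading these variables at startup is shown below. `VaultSettings`, `load_settings`, and `load_settings_from_env` are hypothetical names, and treating all three variables as required is an assumption; the agent may apply defaults for some of them.

```rust
use std::env;

/// Vault connection settings read from the environment.
/// Debug is intentionally not derived so the token is not logged by accident.
struct VaultSettings {
    address: String,
    token: String,
    path_prefix: String,
}

/// Builds settings from any lookup function, which keeps the logic testable
/// without touching the real process environment.
fn load_settings<F>(lookup: F) -> Result<VaultSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Treat unset and blank values the same way: as missing.
    let require = |name: &str| {
        lookup(name)
            .filter(|value| !value.trim().is_empty())
            .ok_or_else(|| format!("missing required environment variable {name}"))
    };
    Ok(VaultSettings {
        address: require("VAULT_ADDRESS")?,
        token: require("VAULT_TOKEN")?,
        path_prefix: require("VAULT_AGENT_PATH_PREFIX")?,
    })
}

/// Convenience wrapper that reads from the process environment.
fn load_settings_from_env() -> Result<VaultSettings, String> {
    load_settings(|name| env::var(name).ok())
}
```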

### Why NOT Shared Tokens?

| Approach | Risk | Decision |
|----------|------|----------|
| **Single Platform Token** | One compromised agent exposes ALL deployments | ❌ Rejected |
| **Per-Customer Token** | Compromises all of one customer's deployments | ❌ Rejected |
| **Per-Deployment Token** | Limits blast radius to single deployment | ✅ Selected |

---

### Steps

1. **Define Configuration Sources and Flow**
- Use Vault as the primary store for app configs and secrets.
- Support fetching templates/scripts from public GitHub or other package sources (zip, tar, etc.).
- Allow fallback to local or built-in templates if remote fetch fails.

2. **Template and Script Management**
- Store default templates in Vault or a managed repo.
- Allow user to specify a GitHub repo, branch, or path for custom templates/scripts.
- Support bash scripts for pre/post-deploy hooks (stored in Vault or fetched remotely).

3. **Network and Environment Integration**
- Parse and apply user-defined network settings (from Status Panel UI/API).
- Merge user-provided ENV key/values with defaults from templates, Vault, and app_vars.
- Validate and resolve conflicts, prioritizing user values.

4. **Deployment Execution**
- Download and render templates/scripts with merged variables.
- Apply network settings as defined by user (docker-compose, k8s, etc.).
- Run pre/post-deploy scripts if present.
- Log all actions and errors for auditability.

5. **Extensibility and Security**
- Support additional package managers (e.g., Helm, apt, pip) as plugins.
- Ensure all secrets/configs are encrypted at rest (Vault) and in transit.
- Allow for future integration with other config sources (S3, GCS, etc.).
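
The environment merge in step 3 can be sketched as a layered map merge in which later layers win. The plan only states that user values take priority; the relative order of the other sources used in the usage note below (template defaults, then Vault, then app_vars) is an assumption, as are the names `Env` and `merge_env`.

```rust
use std::collections::BTreeMap;

/// Environment key/value pairs; BTreeMap keeps the output deterministically sorted.
type Env = BTreeMap<String, String>;

/// Merges layers given in ascending priority: a key in a later layer
/// overwrites the same key from any earlier layer.
fn merge_env(layers: &[&Env]) -> Env {
    let mut merged = Env::new();
    for layer in layers {
        for (key, value) in layer.iter() {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}
```

Calling `merge_env(&[&template, &vault, &app_vars, &user])` would then give user-provided values the final say, while keys the user did not set fall back to the lower layers.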

### Further Considerations

1. Should template fetching support private repos (with token)?
2. How to handle versioning/rollback of configs and deployments?
3. Should we support dry-run/preview before applying changes?
79 changes: 79 additions & 0 deletions src/agent/docker.rs
@@ -618,3 +618,82 @@ pub async fn exec_in_container(name: &str, cmd: &str) -> Result<()> {
        Err(anyhow::anyhow!("exec failed with code {}", exit_code))
    }
}

/// Execute a shell command inside a running container and return output.
/// Returns (exit_code, stdout, stderr) tuple.
pub async fn exec_in_container_with_output(name: &str, cmd: &str) -> Result<(i64, String, String)> {
    use bollard::exec::StartExecResults;
    use futures_util::StreamExt;

    let docker = docker_client()?;
    let resolved_name = resolve_container_name(name)
        .await
        .unwrap_or_else(|_| name.to_string());

    // Create exec instance
    let exec = docker
        .create_exec(
            &resolved_name,
            CreateExecOptions {
                attach_stdout: Some(true),
                attach_stderr: Some(true),
                tty: Some(false),
                cmd: Some(vec![
                    "/bin/sh".to_string(),
                    "-c".to_string(),
                    cmd.to_string(),
                ]),
                ..Default::default()
            },
        )
        .await
        .context("create exec")?;

    // Start exec and capture output
    let start = docker
        .start_exec(&exec.id, None)
        .await
        .context("start exec")?;

    let mut stdout = String::new();
    let mut stderr = String::new();

    match start {
        StartExecResults::Detached => {
            debug!(container = name, command = cmd, "exec detached");
        }
        StartExecResults::Attached { mut output, .. } => {
            while let Some(item) = output.next().await {
                match item {
                    Ok(LogOutput::StdOut { message }) => {
                        stdout.push_str(&String::from_utf8_lossy(&message));
                    }
                    Ok(LogOutput::StdErr { message }) => {
                        stderr.push_str(&String::from_utf8_lossy(&message));
                    }
                    Ok(LogOutput::Console { message }) => {
                        stdout.push_str(&String::from_utf8_lossy(&message));
                    }
                    Ok(LogOutput::StdIn { .. }) => {}
                    Err(e) => error!("exec output stream error: {}", e),
                }
            }
        }
    }

    // Inspect exec to get exit code
    let info = docker
        .inspect_exec(&exec.id)
        .await
        .context("inspect exec")?;
    let exit_code = info.exit_code.unwrap_or_default();

    debug!(
        container = name,
        command = cmd,
        exit_code,
        "exec completed with output"
    );

    Ok((exit_code, stdout, stderr))
}