Skip to content

🏥 CI Failureexamples test workflow: all curl commands fail with ssl connection error after pr #524 #525

@github-actions

Description

@github-actions

Summary

The Examples Test workflow is experiencing a complete failure where all curl-based tests fail with exit code 35 (CURLE_SSL_CONNECT_ERROR) after merging PR #524.

Failed Run Details

Failure Pattern

ALL curl tests failing with exit code 35:

  1. ✅ basic-curl.sh setup - containers start successfully
  2. ❌ basic-curl.sh execution - curl -s https://api.github.com → exit code 35
  3. ✅ using-domains-file.sh setup - containers start successfully
  4. ❌ using-domains-file.sh execution - curl -s https://api.github.com → exit code 35
  5. ✅ debugging.sh setup - containers start successfully
  6. ❌ debugging.sh execution - curl -s https://api.github.com/zen → exit code 35
  7. ⏭️ blocked-domains.sh - skipped due to previous failures

Key Observations

✅ What Works

  • Container creation and startup
  • Squid healthcheck passes
  • iptables rules applied successfully
  • Network configuration (172.30.0.0/24)
  • Docker Compose orchestration

❌ What Fails

  • curl SSL/TLS connection to HTTPS endpoints
  • 100% failure rate across all example tests
  • Deterministic failure (not intermittent)

🔍 Container Logs Analysis

From the debugging.sh test logs:

[entrypoint] Proxy configuration:
[entrypoint]   HTTP_PROXY=
[entrypoint]   HTTPS_PROXY=
[entrypoint] Network information:
[entrypoint]   IP address: 172.30.0.20 
[entrypoint]   Hostname: 83c0d2a68e4f
[entrypoint] Executing command: /bin/bash -c curl -s https://api.github.com/zen

[DEBUG] Agent exit code: 35
``````

## Root Cause Analysis

### Curl Exit Code 35 Meaning

From curl documentation:
``````
CURLE_SSL_CONNECT_ERROR (35)
  A problem occurred somewhere in the SSL/TLS handshake.
  You really want the error buffer and read the message there
  as it pinpoints the problem slightly more.
``````

### PR #524 Changes

The merged PR removed HTTP_PROXY/HTTPS_PROXY environment variables:

**Rationale from PR**:
- "Intercept mode (iptables DNAT 80/443 → squid:3129) handles all routing transparently"
- "Port 3128 is unreachable from the agent container, causing Codex (Rust/reqwest) to fail"

**Changes made**:
1. Removed `HTTP_PROXY` and `HTTPS_PROXY` from agent container environment
2. Added proxy vars to `EXCLUDED_ENV_VARS` to prevent leaking via `--env-all`
3. Updated entrypoint.sh logging to show empty proxy vars

### Hypothesis

The SSL connection failure suggests one of these issues:

1. **iptables DNAT not working correctly**: Traffic to port 443 may not be redirecting to Squid's intercept port (3129)
2. **Squid intercept mode misconfiguration**: Squid may not be properly handling intercepted HTTPS (CONNECT) traffic
3. **Certificate verification issue**: curl in the agent container may not trust the connection without explicit proxy env vars
4. **Squid → External SSL handshake failure**: Squid may be failing to establish the outbound SSL connection

## Evidence Timeline

``````
20:14:47.6048977Z  Container awf-agent  Started
20:14:47.6247851Z [entrypoint] Agentic Workflow Firewall - Agent Container
20:14:47.6251074Z [iptables] NOTE: Host-level DOCKER-USER chain handles egress filtering
20:14:47.6252390Z [iptables] Squid proxy: squid-proxy:3128 (intercept: 3129)
20:14:47.6733845Z [iptables] Redirect HTTP (80) and HTTPS (443) to Squid intercept port...
20:14:47.6906418Z [iptables] NAT rules applied successfully
20:14:47.7058208Z [entrypoint] Executing command: /bin/bash -c curl -s https://api.github.com/zen
20:14:47.8345512Z [DEBUG] Agent exit code: 35

Only 0.1 seconds between command execution and failure - suggests immediate SSL handshake failure.

Impact

  • 🔴 CRITICAL - All example tests blocked
  • Cannot verify basic firewall functionality
  • CI pipeline broken for main branch
  • User-facing examples do not work

Recommended Investigation Steps

  1. Check Squid access logs for connection attempts:

    sudo cat /tmp/squid-logs-1770322481586/access.log
  2. Verify iptables DNAT rules are redirecting 443 traffic:

    docker exec awf-agent iptables -t nat -L -n -v
  3. Test curl with verbose output to see SSL handshake details:

    sudo awf --allow-domains api.github.com --keep-containers -- curl -v https://api.github.com/zen
  4. Compare with pre-PR fix: remove HTTP_PROXY/HTTPS_PROXY env vars from agent container #524 behavior: Check if containers built from parent commit (769a6f5) work correctly

  5. Verify Squid intercept port configuration:

    docker exec awf-squid grep -E '(http_port|ssl_bump)' /etc/squid/squid.conf

Related Issues

Files to Review

  • containers/agent/setup-iptables.sh - NAT redirection rules
  • src/docker-manager.ts - Container environment configuration
  • src/squid-config.ts - Squid proxy configuration
  • containers/squid/squid.conf - Squid intercept port setup

🏥 Automatically investigated by CI Doctor

AI generated by CI Doctor

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingci

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions