fix: benchmark timing broken on macOS #6

Open
hobostay wants to merge 1 commit into aattaran:main from hobostay:fix/macos-benchmark-timing

hobostay commented May 4, 2026


Summary

  • Replaces `date +%s%3N` with `python3 -c 'import time;print(int(time.time()*1000))'` as the primary timing method in `run_benchmark()`
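
A minimal sketch of what such a timing helper could look like. The function name `now_ms` is hypothetical and not taken from the PR; the seconds-only fallback follows the commit message, and this is not the actual diff:

```shell
# Hypothetical helper (not from the PR): print the current time in milliseconds.
now_ms() {
  # Primary method: python3. If it is missing or fails, it exits non-zero,
  # so the `||` fallback below really does run.
  python3 -c 'import time; print(int(time.time() * 1000))' 2>/dev/null ||
    # Seconds-only fallback: `date +%s` works with both GNU and BSD date,
    # but the resolution drops to one second.
    echo "$(( $(date +%s) * 1000 ))"
}

start_ms=$(now_ms)
sleep 1
end_ms=$(now_ms)
echo "elapsed: $((end_ms - start_ms)) ms"
```

Unlike BSD `date` with `%3N`, a failing `python3` call reports the failure through its exit status, which is what makes it safe to chain a fallback behind it.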

Problem

`date +%s%3N` is a GNU coreutils extension. macOS ships BSD `date`, which does not support `%N`. Crucially, BSD `date` does not fail: it exits 0 and prints garbage like `1714841232%3N` (literal `%3N`), so the `|| python3 ...` fallback never triggers. The subsequent `$((end_ms - start_ms))` arithmetic then either produces wrong results or errors out under `set -e`.
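
The failure mode can be reproduced on any platform with a stand-in function. In this sketch, `bsd_date_sim` and `is_integer` are hypothetical names introduced for illustration; the simulated output is the value quoted above:

```shell
# Hypothetical stand-in for BSD date: prints the unsupported specifier
# literally and still exits 0, as described above.
bsd_date_sim() {
  printf '%s\n' '1714841232%3N'
  return 0
}

# `||` only runs its right side when the left side exits non-zero,
# so the garbage value is captured and the fallback is skipped.
start_ms=$(bsd_date_sim || echo "fallback")
echo "captured: $start_ms"   # prints: captured: 1714841232%3N

# Checking that the value consists only of digits catches the bad output.
is_integer() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

if is_integer "$start_ms"; then
  echo "usable timestamp"
else
  echo "not a number, fallback needed"
fi
```

An exit-status check alone cannot detect this case; only validating the output (or avoiding the non-portable format) does.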

Test plan

  • On macOS: run `./deepclaude.sh --benchmark` → should show reasonable ms timings
  • On Linux: run `./deepclaude.sh --benchmark` → should still work (python3 is widely available)

🤖 Generated with Claude Code

`date +%s%3N` is a GNU coreutils extension not supported by BSD date on
macOS. On macOS it silently outputs garbage like `1714841232%3N` without
failing, so the python3 fallback never triggers and the elapsed time
calculation produces wrong results or arithmetic errors under `set -e`.

Use python3 as the primary method (available on both platforms since
macOS ships python3), with a seconds-only fallback.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Poiar added a commit to Poiar/defiant-claude that referenced this pull request Jun 11, 2026
…visibility

**1. Pre-execute web search/fetch before forwarding (aattaran#1):**
- Stripped Anthropic server-side tools (web_search, web_fetch, etc.)
  before forwarding to non-Anthropic providers like DeepSeek.
- populateToolResults still pre-executes pending tool calls locally
  and injects results into the conversation.
- Removed old 400→strip→retry workaround — no longer needed.

**2. --health flag (aattaran#2):**
- Both launchers: quick one-line proxy health check.
- Shows X/Y providers up, session spend, and which are down.

**3. Budget warning in health endpoint (aattaran#4):**
- DEEPCLAUDE_BUDGET_WARNING env var sets a spending threshold.
- Health snapshot shows budgetWarning at 50%, 75%, and 100% of cap.
- Statusline displays ⚠ when approaching or hitting the limit.
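
As an illustration only, the threshold logic described in this item could be sketched as below. The function name `budget_warning_level`, the use of whole cents, and the return values are assumptions, not taken from the referenced commit:

```shell
# Hypothetical helper: map spend and cap (both in whole cents) to the
# highest warning threshold reached: 0, 50, 75, or 100.
budget_warning_level() {
  spend_cents=$1
  cap_cents=$2
  # No cap configured (assumed to mean "no warning").
  if [ "$cap_cents" -le 0 ]; then
    echo 0
    return 0
  fi
  pct=$(( spend_cents * 100 / cap_cents ))
  if [ "$pct" -ge 100 ]; then echo 100
  elif [ "$pct" -ge 75 ]; then echo 75
  elif [ "$pct" -ge 50 ]; then echo 50
  else echo 0
  fi
}

budget_warning_level 120 1000    # prints 0
budget_warning_level 500 1000    # prints 50
budget_warning_level 760 1000    # prints 75
budget_warning_level 1000 1000   # prints 100
```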

**4. Fallback visibility (aattaran#6):**
- recordFallback() in stats.ts tracks the most recent provider failover.
- Health endpoint exposes lastFallback (from/to/timestamp).
- Statusline shows ↳ indicator when failover happened in last 10 min.
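
The 10-minute window check could look roughly like this. `fallback_is_recent` is a hypothetical name, and treating the boundary as inclusive is an assumption; the referenced commit may differ on both:

```shell
# Hypothetical helper: succeed when the last failover (epoch seconds)
# happened within the window, which defaults to 600 s (10 min).
fallback_is_recent() {
  last_ts=$1
  now_ts=$2
  window=${3:-600}
  age=$(( now_ts - last_ts ))
  # Reject timestamps in the future and anything older than the window.
  [ "$age" -ge 0 ] && [ "$age" -le "$window" ]
}

now=$(date +%s)
if fallback_is_recent "$(( now - 120 ))" "$now"; then
  echo "↳ failover in last 10 min"
fi
```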

**Tests:**
- Rewrote tool-strip tests (old 400 retry logic) → pre-execute+strip tests.
- 616 passing (was 617, one fewer test in the cleaner path).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Poiar added a commit to Poiar/defiant-claude that referenced this pull request Jun 12, 2026
Poiar added a commit to Poiar/defiant-claude that referenced this pull request Jun 12, 2026