Commit 01c5ce8

sjarmak and claude committed
Assorted fixes: clone-as-claude, test permissions, audit script
- Servo scrollend Dockerfiles: clone-as-claude pattern (create claude user before git clone, USER claude during clone, USER root after)
- Fix test.sh permissions (644→755) for envoy-dfp, envoy-udp, terraform
- Make TIMEOUT_MULTIPLIER overridable via env var in run_selected_tasks.sh
- Update audit_official_scores.py with improved config discovery and crash_failure detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 46f2b0e commit 01c5ce8

7 files changed, +408 -10 lines changed

benchmarks/ccb_build/servo-scrollend-event-feat-001/environment/Dockerfile

Lines changed: 7 additions & 0 deletions
@@ -13,12 +13,19 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     pkg-config \
     && rm -rf /var/lib/apt/lists/*
 
+# Create claude user and writable work dirs before clone.
+RUN (adduser --disabled-password --gecos '' claude 2>/dev/null || true) && \
+    mkdir -p /workspace /logs && \
+    chown -R claude:claude /workspace /logs
+
 # Clone the actual Servo repository at pinned commit
 # This ensures BOTH baseline and MCP agents have identical file access
 # Note: Servo is a large Rust codebase (~1.6GB)
+USER claude
 RUN git clone --depth 1 https://github.com/sg-evals/servo--be6a2f99.git . && \
     git config user.email "agent@example.com" && \
     git config user.name "Agent"
+USER root
 
 # Task setup complete
 # Note: Both baseline and MCP agents now have access to the real Servo source.

benchmarks/ccb_build/servo-scrollend-event-feat-001/environment/Dockerfile.artifact_only

Lines changed: 7 additions & 0 deletions
@@ -18,13 +18,20 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
     pkg-config \
     && rm -rf /var/lib/apt/lists/*
 
+# Create claude user and writable work dirs before clone.
+RUN (adduser --disabled-password --gecos '' claude 2>/dev/null || true) && \
+    mkdir -p /workspace /logs && \
+    chown -R claude:claude /workspace /logs
+
 # Clone the actual Servo repository at pinned commit
 # This ensures BOTH baseline and MCP agents have identical file access
 # Note: Servo is a large Rust codebase (~1.6GB)
+USER claude
 RUN git clone --filter=blob:none --no-checkout https://github.com/servo/servo.git . && \
     git checkout be6a2f99a1e80060228f41280fd7d2178983e7ed && \
     git config user.email "agent@example.com" && \
     git config user.name "Agent"
+USER root
 
 # Task setup complete
 # Note: Both baseline and MCP agents now have access to the real Servo source.
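Because the clone now runs between `USER claude` and `USER root`, the files it creates are owned by `claude` rather than `root`. A minimal sketch of how that could be checked follows; `tree_owned_by` is a hypothetical helper, not part of this commit, and it assumes a `find` that accepts a user name or numeric uid for `-user`.

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of the commit): succeed only if every entry
# under DIR is owned by OWNER, given as a user name or a numeric uid.
tree_owned_by() {
    local dir="$1" owner="$2"
    local stray
    # Collect every path NOT owned by OWNER; bail out if find itself fails
    # (for example, when the user name does not exist).
    stray="$(find "$dir" ! -user "$owner" -print)" || return 2
    # An empty result means the whole tree belongs to OWNER.
    [ -z "$stray" ]
}
```

Inside a built image, `tree_owned_by /workspace claude` would be expected to succeed, assuming the Dockerfile's working directory is `/workspace` (the `WORKDIR` line falls outside the hunks shown here).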

benchmarks/ccb_fix/envoy-dfp-host-leak-fix-001/tests/test.sh

100644 → 100755
File mode changed.

benchmarks/ccb_fix/envoy-udp-proxy-cds-fix-001/tests/test.sh

100644 → 100755
File mode changed.

benchmarks/ccb_fix/terraform-plan-null-unknown-fix-001/tests/test.sh

100644 → 100755
File mode changed.
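The three mode changes above all add the execute bit to a `tests/test.sh`. As a rough sketch, other test scripts with the same problem could be listed as follows; `find_non_executable_tests` is a hypothetical helper, not something this commit adds, and its default root of `benchmarks` is taken from the paths shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): print every test.sh under
# ROOT whose owner-execute bit is missing, e.g. a file left at mode 644.
find_non_executable_tests() {
    local root="${1:-benchmarks}"
    # "-perm -u+x" matches files that have the owner-execute bit set;
    # the leading "!" inverts the match.
    find "$root" -type f -name 'test.sh' ! -perm -u+x -print
}
```

Each path it prints could then be fixed with `chmod 755`, or with `git update-index --chmod=+x <path>` to record the bit in the index directly.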

configs/run_selected_tasks.sh

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ USE_CASE_CATEGORY_FILTER=""
 MODEL="${MODEL:-anthropic/claude-haiku-4-5-20251001}"
 CONCURRENCY=1 # harbor -n: trials per task
 PARALLEL_TASKS=0 # 0 = auto-detect from accounts; overridden by --parallel N
-TIMEOUT_MULTIPLIER=10
+TIMEOUT_MULTIPLIER="${TIMEOUT_MULTIPLIER:-10}"
 RUN_BASELINE=true
 RUN_FULL=true
 CATEGORY="${CATEGORY:-staging}"
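The new assignment uses the shell's `${VAR:-default}` expansion, which keeps a value supplied by the environment and falls back to `10` when the variable is unset or empty. A minimal sketch of that behaviour, using a hypothetical function `resolve_timeout_multiplier` that does not exist in the script:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in run_selected_tasks.sh): prints the value the
# new assignment would produce for the current environment.
resolve_timeout_multiplier() {
    # ${VAR:-default} yields the default when VAR is unset OR empty.
    printf '%s\n' "${TIMEOUT_MULTIPLIER:-10}"
}
```

With this change, a run can presumably be tuned as `TIMEOUT_MULTIPLIER=20 configs/run_selected_tasks.sh`, plus whatever arguments the run needs, without editing the script.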
