Commit 3278e35

sjarmak and claude committed
refactor: migrate all sg-benchmarks references to sg-evals
Bulk rename sg-benchmarks → sg-evals across 719 files:
- Dockerfiles (clone URLs, ENV vars, clone manifests)
- Dockerfile.sg_only and Dockerfile.artifact_only
- instruction_mcp.md (repo names in agent instructions)
- task.toml (source_repo metadata)
- oracle_answer.json and oracle_checks.py
- scripts (mirror creation, generators, audits)
- configs (instance_to_mirror.json, selection files)
- docs (CONFIGS.md, MCP_UNIQUE_TASKS.md, etc.)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 88bed78 commit 3278e35
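
A rename touching this many files is usually scripted. The sketch below shows one way such a bulk replacement could be done; it is not the script used for this commit, and `bulk_rename`, its `root` argument, and the UTF-8-only handling are illustrative assumptions.

```python
from pathlib import Path

OLD = "sg-benchmarks"
NEW = "sg-evals"


def bulk_rename(root: str, old: str = OLD, new: str = NEW) -> list[str]:
    """Replace `old` with `new` in every UTF-8 text file under `root`.

    Returns the relative paths of the files that were rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything inside a .git directory.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # binary or non-UTF-8 file: leave untouched
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path.relative_to(root).as_posix())
    return changed
```

The match is case-sensitive and literal, so it would leave strings such as the `SG-BENCHMARKS ORG` headings and the `sg_benchmarks_org` variable name untouched. Both appear unchanged in the context lines of this diff.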

719 files changed, +3541 -2910 lines changed

agents/claude_baseline_agent.py

Lines changed: 31 additions & 31 deletions
@@ -354,9 +354,9 @@ def _get_repo_display(self) -> str:
 
 Resolution order:
 1. SOURCEGRAPH_REPO_NAME env var (explicit override, highest priority)
-2. LOCOBENCH_PROJECT_ID -> sg-benchmarks/locobench-{prefix}
+2. LOCOBENCH_PROJECT_ID -> sg-evals/locobench-{prefix}
 (checked in host env AND _container_env_cache populated by setup())
-3. SWEBENCH_REPO_COMMIT -> sg-benchmarks/{repo_info}
+3. SWEBENCH_REPO_COMMIT -> sg-evals/{repo_info}
 (checked in host env AND _container_env_cache populated by setup())
 4. Fallback: "the codebase"
 """
@@ -371,19 +371,19 @@ def _get_repo_display(self) -> str:
 
 locobench_prefix = os.environ.get("LOCOBENCH_PROJECT_ID", "") or cache.get("LOCOBENCH_PROJECT_ID", "")
 if locobench_prefix:
-return f"sg-benchmarks/locobench-{locobench_prefix}"
+return f"sg-evals/locobench-{locobench_prefix}"
 
 repo_info = os.environ.get("SWEBENCH_REPO_COMMIT", "") or cache.get("SWEBENCH_REPO_COMMIT", "")
 if repo_info:
-return f"sg-benchmarks/{repo_info}"
+return f"sg-evals/{repo_info}"
 
 return "the codebase"
 
 def _get_repo_list(self) -> list:
-"""Return list of sg-benchmarks repo names from SOURCEGRAPH_REPOS env var.
+"""Return list of sg-evals repo names from SOURCEGRAPH_REPOS env var.
 
 Multi-repo MCP-unique tasks set SOURCEGRAPH_REPOS as a comma-separated
-list of sg-benchmarks mirror names (e.g. "sg-benchmarks/grafana,sg-benchmarks/grafana-loki").
+list of sg-evals mirror names (e.g. "sg-evals/grafana,sg-evals/grafana-loki").
 
 Resolution order:
 1. Host env var SOURCEGRAPH_REPOS (set by config script)
@@ -437,9 +437,9 @@ def _rewrite_repo_references(text: str, sg_display: str) -> str:
 - **Repository**: org/repo (lang, ~NLOC)
 - **Repository:** org/repo
 - **Repo:** `org/repo`
-to reference the sg-benchmarks mirror, keeping the original as context.
+to reference the sg-evals mirror, keeping the original as context.
 """
-if not sg_display or not sg_display.startswith("sg-benchmarks/"):
+if not sg_display or not sg_display.startswith("sg-evals/"):
 return text
 
 sg_full = f"github.com/{sg_display}"
@@ -614,12 +614,12 @@ def create_run_agent_commands(self, instruction: str) -> list[ExecInput]:
 branch_instructions = (
 f"\n**Branch Search Instructions**\n\n"
 f"IMPORTANT: You must search the `{scip_branch}` branch for all "
-f"repositories in `github.com/sg-benchmarks/`.\n\n"
+f"repositories in `github.com/sg-evals/`.\n\n"
 f"When using search and file tools, always specify the "
 f"`{scip_branch}` branch:\n\n"
 f"- **keyword_search / nls_search:** Include "
 f"`rev:{scip_branch}` in your query alongside the repo filter\n"
-f' Example: `repo:^github\\.com/sg-benchmarks/REPO$ '
+f' Example: `repo:^github\\.com/sg-evals/REPO$ '
 f"rev:{scip_branch} YOUR_SEARCH_TERMS`\n"
 f"- **read_file / list_files:** Set the `revision` parameter "
 f'to `"{scip_branch}"`\n'
@@ -664,7 +664,7 @@ def create_run_agent_commands(self, instruction: str) -> list[ExecInput]:
 repo_scope=repo_scope, workflow_tail=workflow_tail
 )
 # Rewrite upstream repo references in instruction body to point
-# at the sg-benchmarks mirror so the agent sees consistent names.
+# at the sg-evals mirror so the agent sees consistent names.
 instruction = self._rewrite_repo_references(instruction, repo_display)
 instruction = self._inject_repo_context(
 instruction, repo_display, self._get_repo_list()
@@ -783,20 +783,20 @@ def create_run_agent_commands(self, instruction: str) -> list[ExecInput]:
 - ✅ CORRECT: "In {repo_display}, where is [code]?"
 - ✅ CORRECT: "Search {repo_display} for [query]"
 - ❌ WRONG: "In the codebase, where..." (too vague, might search wrong repo)
-- ❌ WRONG: "In navidrome, where..." (searches original, not sg-benchmarks mirror)
+- ❌ WRONG: "In navidrome, where..." (searches original, not sg-evals mirror)
 
 IMPORTANT - SG-BENCHMARKS ORG:
-The Deep Search MCP is configured to search in the sg-benchmarks GitHub organization.
+The Deep Search MCP is configured to search in the sg-evals GitHub organization.
 This organization contains mirrors of all benchmark repositories with HEAD pinned to match your local working copy's commit.
-Do NOT search the original repositories - use sg-benchmarks which has the correct indexed commit.
+Do NOT search the original repositories - use sg-evals which has the correct indexed commit.
 
 Workflow requirement:
 1) Run Deep Search MCP to find relevant code and understand relationships
 ALWAYS include repository reference: "In {repo_display}, [your query]"
 2) Open only the relevant files/regions needed to implement the fix
 3) If Deep Search returns no results, broaden the search query before opening more files
 
-Deep Search is configured for sg-benchmarks org with the correct commit, so results should match your local working copy.
+Deep Search is configured for sg-evals org with the correct commit, so results should match your local working copy.
 
 IMPORTANT: If your first search returns empty results, the repository name may differ
 from what you expect. Use `mcp__sourcegraph__sg_list_repos` (if available) to discover the correct repo name
@@ -841,10 +841,10 @@ def create_run_agent_commands(self, instruction: str) -> list[ExecInput]:
 - ✅ CORRECT: "In {repo_display}, where is [code]?"
 - ✅ CORRECT: "Search {repo_display} for [query]"
 - ❌ WRONG: "In the codebase, where..." (too vague, might search wrong repo)
-- ❌ WRONG: "In navidrome, where..." (searches original, not sg-benchmarks mirror)
+- ❌ WRONG: "In navidrome, where..." (searches original, not sg-evals mirror)
 
 IMPORTANT - SG-BENCHMARKS ORG:
-The Deep Search MCP is configured to search in the sg-benchmarks GitHub organization.
+The Deep Search MCP is configured to search in the sg-evals GitHub organization.
 This organization contains mirrors of all benchmark repositories with HEAD pinned to match your local working copy's commit.
 Deep Search results should now match your local working copy without version mismatches.
 
@@ -870,7 +870,7 @@ def create_run_agent_commands(self, instruction: str) -> list[ExecInput]:
 - Simple pattern/location verification → Local search
 
 This hybrid approach gives you the semantic understanding of Deep Search plus the speed of local tools.
-With sg-benchmarks org configured, Deep Search results should accurately reflect your working copy.
+With sg-evals org configured, Deep Search results should accurately reflect your working copy.
 
 IMPORTANT: If your first search returns empty results, the repository name may differ
 from what you expect. Use `mcp__sourcegraph__sg_list_repos` (if available) to discover the correct repo name
@@ -1718,19 +1718,19 @@ async def _setup_deepsearch_mcp(self, environment: BaseEnvironment) -> None:
 repo_info = os.environ.get("SWEBENCH_REPO_COMMIT", "")
 repo_name = ""
 commit = ""
-sg_benchmarks_org = "sg-benchmarks"
+sg_benchmarks_org = "sg-evals"
 if repo_info and "--" in repo_info:
 repo_name, commit = repo_info.split("--", 1)
-logger.info(f"BaselineClaudeCodeAgent: Deep Search MCP will use sg-benchmarks repo={repo_name}, commit={commit}")
+logger.info(f"BaselineClaudeCodeAgent: Deep Search MCP will use sg-evals repo={repo_name}, commit={commit}")
 
-# Deep Search MCP config - add sg-benchmarks org and repo info if available
+# Deep Search MCP config - add sg-evals org and repo info if available
 deepsearch_config = {
 "type": "http",
 "url": deepsearch_url,
 "headers": {"Authorization": f"token {deepsearch_token}"},
 }
 
-# Add sg-benchmarks org and repo hint to the config if we have repo info
+# Add sg-evals org and repo hint to the config if we have repo info
 if repo_name:
 deepsearch_config["org"] = sg_benchmarks_org
 deepsearch_config["repo"] = repo_name
@@ -1861,19 +1861,19 @@ async def _setup_deepsearch_hybrid_mcp(self, environment: BaseEnvironment) -> No
 repo_info = os.environ.get("SWEBENCH_REPO_COMMIT", "")
 repo_name = ""
 commit = ""
-sg_benchmarks_org = "sg-benchmarks"
+sg_benchmarks_org = "sg-evals"
 if repo_info and "--" in repo_info:
 repo_name, commit = repo_info.split("--", 1)
-logger.info(f"BaselineClaudeCodeAgent: Hybrid Deep Search MCP will use sg-benchmarks repo={repo_name}, commit={commit}")
+logger.info(f"BaselineClaudeCodeAgent: Hybrid Deep Search MCP will use sg-evals repo={repo_name}, commit={commit}")
 
-# Deep Search MCP config (same as deepsearch mode) - add sg-benchmarks org and repo info if available
+# Deep Search MCP config (same as deepsearch mode) - add sg-evals org and repo info if available
 deepsearch_config = {
 "type": "http",
 "url": deepsearch_url,
 "headers": {"Authorization": f"token {deepsearch_token}"},
 }
 
-# Add sg-benchmarks org and repo hint to the config if we have repo info
+# Add sg-evals org and repo hint to the config if we have repo info
 if repo_name:
 deepsearch_config["org"] = sg_benchmarks_org
 deepsearch_config["repo"] = repo_name
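
Both Deep Search setup hunks build the same config from a `repo--commit` string. A minimal standalone sketch of that logic follows, assuming the behaviour visible in the hunks above; `build_deepsearch_config` and its parameters are hypothetical names, and the example URL is a placeholder.

```python
def build_deepsearch_config(url: str, token: str, repo_info: str = "",
                            org: str = "sg-evals") -> dict:
    """Build an HTTP MCP server entry, adding org/repo hints when possible."""
    config = {
        "type": "http",
        "url": url,
        "headers": {"Authorization": f"token {token}"},
    }
    repo_name = ""
    if repo_info and "--" in repo_info:
        # Split on the first "--" only; the remainder is treated as the commit.
        repo_name, _commit = repo_info.split("--", 1)
    if repo_name:
        config["org"] = org
        config["repo"] = repo_name
    return config


# Example with a made-up repo/commit pair and placeholder endpoint.
example = build_deepsearch_config("https://example.invalid/mcp", "TOKEN", "navidrome--abc123")
```

The org value changes to `sg-evals` in this commit, while the variable holding it keeps the name `sg_benchmarks_org` in both hunks.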
@@ -1896,7 +1896,7 @@ async def _setup_deepsearch_hybrid_mcp(self, environment: BaseEnvironment) -> No
 await environment.upload_file(
 source_path=mcp_config_path, target_path="/logs/agent/sessions/.mcp.json"
 )
-logger.info(f"BaselineClaudeCodeAgent: Hybrid Deep Search MCP configured at /logs/agent/sessions/ ({deepsearch_url}) with sg-benchmarks org")
+logger.info(f"BaselineClaudeCodeAgent: Hybrid Deep Search MCP configured at /logs/agent/sessions/ ({deepsearch_url}) with sg-evals org")
 
 # Get repo display name for CLAUDE.md
 repo_display = self._get_repo_display()
@@ -1925,11 +1925,11 @@ async def _setup_deepsearch_hybrid_mcp(self, environment: BaseEnvironment) -> No
 - ✅ CORRECT: "In {repo_display}, where is [code]?"
 - ✅ CORRECT: "Search {repo_display} for [query]"
 - ❌ WRONG: "In the codebase, where..." (too vague)
-- ❌ WRONG: "In navidrome, where..." (searches original, not sg-benchmarks mirror)
+- ❌ WRONG: "In navidrome, where..." (searches original, not sg-evals mirror)
 
 ## SG-BENCHMARKS ORG - COMMIT-MATCHED SEARCH
 
-🎯 **Deep Search is configured to search within the **sg-benchmarks** GitHub organization.**
+🎯 **Deep Search is configured to search within the **sg-evals** GitHub organization.**
 
 This organization contains mirrors of all benchmark repositories with:
 - HEAD pinned to the exact same commit as your local working copy
@@ -1940,7 +1940,7 @@ async def _setup_deepsearch_hybrid_mcp(self, environment: BaseEnvironment) -> No
 
 Use this decision logic to pick the right tool:
 
-### When to Use Deep Search MCP First (sg-benchmarks org):
+### When to Use Deep Search MCP First (sg-evals org):
 1. **Bug localization** - "In {repo_display}, where in the code does [error/behavior] occur?"
 2. **Error path discovery** - "In {repo_display}, what code handles [specific error condition]?"
 3. **Data flow tracing** - "In {repo_display}, how does data flow from [source] to [destination]?"
@@ -2052,7 +2052,7 @@ async def _setup_deepsearch_hybrid_mcp(self, environment: BaseEnvironment) -> No
 
 ## Why This Matters
 
-With sg-benchmarks org correctly configured:
+With sg-evals org correctly configured:
 - Deep Search results are **accurate** (same commit as your code)
 - Deep Search understands **relationships** (not just text matching)
 - You have **full tool access** (hybrid: MCP + local tools)
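
The resolution order documented in `_get_repo_display` for this file can be sketched as a standalone function. This is an approximation under stated assumptions: `resolve_repo_display` is a hypothetical name, `env` and `cache` stand in for `os.environ` and `_container_env_cache`, and returning `SOURCEGRAPH_REPO_NAME` unchanged is assumed, since the diff does not show that branch.

```python
import os


def resolve_repo_display(env=None, cache=None, org: str = "sg-evals") -> str:
    """Resolve the repo name shown to the agent, highest priority first."""
    env = os.environ if env is None else env
    cache = cache or {}

    # 1. Explicit override (assumed to be returned as-is).
    explicit = env.get("SOURCEGRAPH_REPO_NAME", "")
    if explicit:
        return explicit

    # 2. LoCoBench project id: host env first, then the container cache.
    prefix = env.get("LOCOBENCH_PROJECT_ID", "") or cache.get("LOCOBENCH_PROJECT_ID", "")
    if prefix:
        return f"{org}/locobench-{prefix}"

    # 3. SWE-bench "repo--commit" string, same lookup order.
    repo_info = env.get("SWEBENCH_REPO_COMMIT", "") or cache.get("SWEBENCH_REPO_COMMIT", "")
    if repo_info:
        return f"{org}/{repo_info}"

    # 4. Nothing known about the repository.
    return "the codebase"
```

Because `_rewrite_repo_references` returns early unless the display name starts with `sg-evals/`, the fallback string `"the codebase"` leaves the instruction text unmodified.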

agents/harnesses/base.py

Lines changed: 3 additions & 3 deletions
@@ -107,7 +107,7 @@ def _prepare_instruction(self, instruction: str) -> str:
 parts.append(self.SG_TOOL_REFERENCE)
 elif mcp_type == "deepsearch":
 parts.append("## Deep Search Guidance")
-parts.append(f"Search `sg-benchmarks/{repo}` with Deep Search when you need cross-file understanding")
+parts.append(f"Search `sg-evals/{repo}` with Deep Search when you need cross-file understanding")
 elif mcp_type == "deepsearch_hybrid":
 parts.append("## Deep Search Hybrid Guidance")
 parts.append("Use Deep Search for semantic exploration and local tools for verification.")
@@ -124,11 +124,11 @@ def _get_repo_display(self) -> str:
 cache = self._container_env_cache
 locobench = cache.get("LOCOBENCH_PROJECT_ID") or os.environ.get("LOCOBENCH_PROJECT_ID", "")
 if locobench:
-return f"sg-benchmarks/locobench-{locobench}"
+return f"sg-evals/locobench-{locobench}"
 
 swebench = cache.get("SWEBENCH_REPO_COMMIT") or os.environ.get("SWEBENCH_REPO_COMMIT", "")
 if swebench:
-return f"sg-benchmarks/{swebench}"
+return f"sg-evals/{swebench}"
 
 return "the codebase"
 
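
The two `_get_repo_display` implementations touched by this commit consult their sources in opposite order: `agents/claude_baseline_agent.py` reads the host environment before the container cache, while `agents/harnesses/base.py` reads the cache first. The diff does not say whether this is intentional. The sketch below only illustrates the consequence; the helper names are hypothetical.

```python
def lookup_env_first(key: str, env: dict, cache: dict) -> str:
    """Order seen in claude_baseline_agent.py: host env, then container cache."""
    return env.get(key, "") or cache.get(key, "")


def lookup_cache_first(key: str, env: dict, cache: dict) -> str:
    """Order seen in harnesses/base.py: container cache, then host env."""
    return cache.get(key) or env.get(key, "")


# When both sources define the key, the two orders disagree.
host_env = {"LOCOBENCH_PROJECT_ID": "host-value"}
container_cache = {"LOCOBENCH_PROJECT_ID": "container-value"}
env_first = lookup_env_first("LOCOBENCH_PROJECT_ID", host_env, container_cache)
cache_first = lookup_cache_first("LOCOBENCH_PROJECT_ID", host_env, container_cache)
```

When only one source defines the variable, both orders give the same result, so the difference matters only if the host and the container disagree.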

benchmarks/ccb_build/bustub-hyperloglog-impl-001/environment/Dockerfile.sg_only

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 
 FROM ghcr.io/theagentcompany/sde-implement-hyperloglog-image:1.0.0
 
-ENV SOURCEGRAPH_REPO_NAME=sg-benchmarks/bustub--d5f79431
+ENV SOURCEGRAPH_REPO_NAME=sg-evals/bustub--d5f79431
 
 # TAC environment variables (needed by verifier)
 ENV TAC_SERVER_HOSTNAME=localhost
@@ -31,6 +31,6 @@ RUN git init 2>/dev/null || (git config --global init.defaultBranch main && git
 RUN touch /tmp/.sg_only_mode
 
 # Clone manifest for sgonly_verifier_wrapper.sh to restore repo at verify time
-RUN echo '{"repos":[{"mirror":"sg-benchmarks/bustub--d5f79431","dest":"/workspace"}]}' > /tmp/.sg_only_clone_manifest.json
+RUN echo '{"repos":[{"mirror":"sg-evals/bustub--d5f79431","dest":"/workspace"}]}' > /tmp/.sg_only_clone_manifest.json
 
 ENTRYPOINT []

benchmarks/ccb_build/bustub-hyperloglog-impl-001/instruction_mcp.md

Lines changed: 4 additions & 4 deletions
@@ -2,9 +2,9 @@
 
 **Local source files are not present.** Your workspace does not contain source code. You **MUST** use Sourcegraph MCP tools to discover, read, and understand code before making any changes.
 
-**Target Repository:** `github.com/sg-benchmarks/bustub--d5f79431`
-- Use `repo:^github.com/sg-benchmarks/bustub--d5f79431$` filter in keyword_search
-- Use `github.com/sg-benchmarks/bustub--d5f79431` as the `repo` parameter for go_to_definition/find_references/read_file
+**Target Repository:** `github.com/sg-evals/bustub--d5f79431`
+- Use `repo:^github.com/sg-evals/bustub--d5f79431$` filter in keyword_search
+- Use `github.com/sg-evals/bustub--d5f79431` as the `repo` parameter for go_to_definition/find_references/read_file
 
 
 ## Required Workflow
@@ -67,7 +67,7 @@ If MCP search returns no results:
 
 # Implement HyperLogLog Algorithm
 
-**Repository:** github.com/sg-benchmarks/bustub--d5f79431 (mirror of bustub) (TheAgentCompany GitLab)
+**Repository:** github.com/sg-evals/bustub--d5f79431 (mirror of bustub) (TheAgentCompany GitLab)
 **Difficulty:** HARD
 **Category:** ccb_tac
 **Task Type:** Algorithm Implementation
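
The instruction files give the anchored filter with an unescaped dot (`repo:^github.com/...$`), whereas the branch instructions in `agents/claude_baseline_agent.py` emit `github\.com`. A small sketch of building the escaped form from a mirror name follows. `repo_filter` and `repo_param` are hypothetical helpers, and whether the search backend requires the escaped dot is not stated in this commit.

```python
def repo_param(mirror: str, host: str = "github.com") -> str:
    """Plain repository name, as passed in the `repo` parameter of read_file etc."""
    return f"{host}/{mirror}"


def repo_filter(mirror: str, host: str = "github.com") -> str:
    """Anchored `repo:` filter for keyword_search that matches one mirror only."""
    # Escape dots so the regex "." cannot match an arbitrary character.
    escaped = repo_param(mirror, host).replace(".", r"\.")
    return f"repo:^{escaped}$"


bustub_filter = repo_filter("sg-evals/bustub--d5f79431")
bustub_param = repo_param("sg-evals/bustub--d5f79431")
```

The `^` and `$` anchors keep the filter from also matching mirrors whose names merely contain the target name as a substring.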

benchmarks/ccb_build/camel-fix-protocol-feat-001/environment/Dockerfile.sg_only

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 
 FROM eclipse-temurin:17-jdk
 
-ENV SOURCEGRAPH_REPO_NAME=sg-benchmarks/camel--1006f047
+ENV SOURCEGRAPH_REPO_NAME=sg-evals/camel--1006f047
 
 ENV DEBIAN_FRONTEND=noninteractive
 
@@ -26,7 +26,7 @@ RUN git init && \
 RUN mkdir -p /logs/agent /logs/verifier
 
 # Clone manifest for verifier (clone-at-verify strategy)
-RUN echo '{"workdir":"/workspace","repos":[{"mirror":"sg-benchmarks/camel--1006f047","target_dir":"."}]}' > /tmp/.sg_only_clone_manifest.json
+RUN echo '{"workdir":"/workspace","repos":[{"mirror":"sg-evals/camel--1006f047","target_dir":"."}]}' > /tmp/.sg_only_clone_manifest.json
 
 # Mark sg_only mode
 RUN touch /tmp/.sg_only_mode
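
Two clone-manifest shapes appear in these Dockerfiles: one with a per-repo `dest`, and one with a top-level `workdir` plus a per-repo `target_dir`. The sketch below normalizes both into (mirror, destination) pairs. It assumes `target_dir` is resolved relative to `workdir`; the verifier wrapper that actually consumes the manifest is not part of the visible diff, and `clone_targets` is a hypothetical name.

```python
import json
import posixpath


def clone_targets(manifest_text: str) -> list[tuple[str, str]]:
    """Return (mirror, destination directory) pairs from a clone manifest."""
    manifest = json.loads(manifest_text)
    workdir = manifest.get("workdir", "/")
    targets = []
    for repo in manifest["repos"]:
        if "dest" in repo:
            # Shape 1: absolute destination given per repository.
            dest = repo["dest"]
        else:
            # Shape 2 (assumed semantics): target_dir is relative to workdir.
            dest = posixpath.normpath(
                posixpath.join(workdir, repo.get("target_dir", "."))
            )
        targets.append((repo["mirror"], dest))
    return targets


shape_one = '{"repos":[{"mirror":"sg-evals/bustub--d5f79431","dest":"/workspace"}]}'
shape_two = '{"workdir":"/workspace","repos":[{"mirror":"sg-evals/camel--1006f047","target_dir":"."}]}'
```

Since the `mirror` field is the only part of either manifest that this commit changes, a check that every manifest mirror starts with `sg-evals/` would be a cheap audit for the rename.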

benchmarks/ccb_build/camel-fix-protocol-feat-001/instruction_mcp.md

Lines changed: 4 additions & 4 deletions
@@ -2,9 +2,9 @@
 
 **Local source files are not present.** Your workspace does not contain source code. You **MUST** use Sourcegraph MCP tools to discover, read, and understand code before making any changes.
 
-**Target Repository:** `github.com/sg-benchmarks/camel--1006f047`
-- Use `repo:^github.com/sg-benchmarks/camel--1006f047$` filter in keyword_search
-- Use `github.com/sg-benchmarks/camel--1006f047` as the `repo` parameter for go_to_definition/find_references/read_file
+**Target Repository:** `github.com/sg-evals/camel--1006f047`
+- Use `repo:^github.com/sg-evals/camel--1006f047$` filter in keyword_search
+- Use `github.com/sg-evals/camel--1006f047` as the `repo` parameter for go_to_definition/find_references/read_file
 
 
 ## Required Workflow
@@ -111,7 +111,7 @@ Study existing components like `camel-kafka`, `camel-netty`, or `camel-amqp` for
 
 ## Context
 
-- **Repository**: github.com/sg-benchmarks/camel--1006f047 (mirror of apache/camel) (Java, ~2M LOC)
+- **Repository**: github.com/sg-evals/camel--1006f047 (mirror of apache/camel) (Java, ~2M LOC)
 - **Category**: Feature Implementation
 - **Difficulty**: hard
 - **Subsystem Focus**: components/camel-fix/ (new module), components/pom.xml (registration)

benchmarks/ccb_build/cgen-deps-install-001/environment/Dockerfile.sg_only

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 
 FROM ubuntu:22.04
 
-ENV SOURCEGRAPH_REPO_NAME=sg-benchmarks/cgen--dibench
+ENV SOURCEGRAPH_REPO_NAME=sg-evals/cgen--dibench
 
 ENV DEBIAN_FRONTEND=noninteractive
 
@@ -25,7 +25,7 @@ RUN git init && \
 RUN mkdir -p /logs/agent /logs/verifier
 
 # Clone manifest for verifier (clone-at-verify strategy)
-RUN echo '{"workdir":"/app/repo","repos":[{"mirror":"sg-benchmarks/cgen--dibench","target_dir":"."}]}' > /tmp/.sg_only_clone_manifest.json
+RUN echo '{"workdir":"/app/repo","repos":[{"mirror":"sg-evals/cgen--dibench","target_dir":"."}]}' > /tmp/.sg_only_clone_manifest.json
 
 # Mark sg_only mode
 RUN touch /tmp/.sg_only_mode

benchmarks/ccb_build/cgen-deps-install-001/instruction_mcp.md

Lines changed: 4 additions & 4 deletions
@@ -2,9 +2,9 @@
 
 **Local source files are not present.** Your workspace does not contain source code. You **MUST** use Sourcegraph MCP tools to discover, read, and understand code before making any changes.
 
-**Target Repository:** `github.com/sg-benchmarks/cgen--dibench`
-- Use `repo:^github.com/sg-benchmarks/cgen--dibench$` filter in keyword_search
-- Use `github.com/sg-benchmarks/cgen--dibench` as the `repo` parameter for go_to_definition/find_references/read_file
+**Target Repository:** `github.com/sg-evals/cgen--dibench`
+- Use `repo:^github.com/sg-evals/cgen--dibench$` filter in keyword_search
+- Use `github.com/sg-evals/cgen--dibench` as the `repo` parameter for go_to_definition/find_references/read_file
 
 
 ## Required Workflow
@@ -65,7 +65,7 @@ If MCP search returns no results:
 
 ---
 
-**Sourcegraph Repository:** `github.com/sg-benchmarks/cgen--dibench`
+**Sourcegraph Repository:** `github.com/sg-evals/cgen--dibench`
 
 # Dependency Inference Task
 

benchmarks/ccb_build/codecoverage-deps-install-001/environment/Dockerfile.sg_only

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 
 FROM ubuntu:22.04
 
-ENV SOURCEGRAPH_REPO_NAME=sg-benchmarks/CodeCoverageSummary--dibench
+ENV SOURCEGRAPH_REPO_NAME=sg-evals/CodeCoverageSummary--dibench
 
 ENV DEBIAN_FRONTEND=noninteractive
 
@@ -25,7 +25,7 @@ RUN git init && \
 RUN mkdir -p /logs/agent /logs/verifier
 
 # Clone manifest for verifier (clone-at-verify strategy)
-RUN echo '{"workdir":"/app/repo","repos":[{"mirror":"sg-benchmarks/CodeCoverageSummary--dibench","target_dir":"."}]}' > /tmp/.sg_only_clone_manifest.json
+RUN echo '{"workdir":"/app/repo","repos":[{"mirror":"sg-evals/CodeCoverageSummary--dibench","target_dir":"."}]}' > /tmp/.sg_only_clone_manifest.json
 
 # Mark sg_only mode
 RUN touch /tmp/.sg_only_mode
