Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
103 changes: 103 additions & 0 deletions .claude/prompts/nl-unity-suite.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# CLAUDE TASK: Run NL/T editing tests for Unity MCP repo and emit JUnit

You are running in CI at the repository root. Use only the tools that are allowed by the workflow:
- View, GlobTool, GrepTool for reading.
- Bash for local shell (git is allowed).
- BatchTool for grouping.
- MCP tools from server "unity" (exposed as mcp__unity__*).

## Test target
- Primary file: `ClaudeTests/longUnityScript-claudeTest.cs`
- For each operation, prefer structured edit tools (`replace_method`, `insert_method`, `delete_method`, `anchor_insert`, `apply_text_edits`, `regex_replace`) via the MCP server.
- Include `precondition_sha256` for any text path write.

## Output requirements
- Create a JUnit XML at `reports/claude-nl-tests.xml`.
- Each test = one `<testcase>` with `classname="UnityMCP.NL"` or `UnityMCP.T`.
- On failure, include a `<failure>` node with a concise message and the last evidence snippet (10–20 lines).
- Also write a human summary at `reports/claude-nl-tests.md` with checkboxes and the windowed reads.

## Safety & hygiene
- Make edits in-place, then revert them at the end (`git stash -u`/`git reset --hard` or balanced counter-edits) so the workspace is clean for subsequent steps.
- Never push commits from CI.
- If a write fails midway, ensure the file is restored before proceeding.

## NL-0. Sanity Reads (windowed)
- Tail 120 lines of `ClaudeTests/longUnityScript-claudeTest.cs`.
- Show 40 lines around method `Update`.
- **Pass** if both windows render with expected anchors present.

## NL-1. Method replace/insert/delete (natural-language)
- Replace `HasTarget` with block-bodied version returning `currentTarget != null`.
- Insert `PrintSeries()` after `GetCurrentTarget` logging `1,2,3`.
- Verify by reading 20 lines around the anchor.
- Delete `PrintSeries()` and verify removal.
- **Pass** if diffs match and verification windows show expected content.

## NL-2. Anchor comment insertion
- Add a comment `Build marker OK` immediately above the `Update` method.
- **Pass** if the comment appears directly above the `public void Update()` line.

## NL-3. End-of-class insertion
- Insert a 3-line comment `Tail test A/B/C` before the last method or immediately before the final class brace (preview, then apply).
- **Pass** if windowed read shows the three lines at the intended location.

## NL-4. Compile trigger
- After any NL edit, ensure no stale compiler errors:
- Write a short marker edit, then **revert** after validating.
- The CI job will run Unity compile separately; record your local check (e.g., file parity and syntax sanity) as INFO, but do not attempt to invoke Unity here.

## T-A. Anchor insert (text path)
- Insert after `GetCurrentTarget`: `private int __TempHelper(int a, int b) => a + b;`
- Verify via read; then delete with a `regex_replace` targeting only that helper block.
- **Pass** if round-trip leaves the file exactly as before.

## T-B. Replace method body with minimal range
- Identify `HasTarget` body lines; single `replace_range` to change only inside braces; then revert.
- **Pass** on exact-range change + revert.

## T-C. Header/region preservation
- For `ApplyBlend`, change only interior lines via `replace_range`; the method signature and surrounding `#region`/`#endregion` markers must remain untouched.
- **Pass** if signature and region markers unchanged.

## T-D. End-of-class insertion (anchor)
- Find final class brace; `position: before` to append a temporary helper; then remove.
- **Pass** if insert/remove verified.

## T-E. Temporary method lifecycle
- Insert helper (T-A), update helper implementation via `apply_text_edits`, then delete with `regex_replace`.
- **Pass** if lifecycle completes and file returns to original checksum.

## T-F. Multi-edit atomic batch
- In one call, perform two `replace_range` tweaks and one comment insert at the class end; verify all-or-nothing behavior.
- **Pass** if either all 3 apply or none.

## T-G. Path normalization
- Run the same edit once with `unity://path/ClaudeTests/longUnityScript-claudeTest.cs` and once with `ClaudeTests/longUnityScript-claudeTest.cs` (if supported).
- **Pass** if both target the same file and no path duplication.

## T-H. Validation levels
- After edits, run `validate` with `level: "standard"`, then `"basic"` for temporarily unbalanced text ops; final state must be valid.
- **Pass** if validation OK and final file compiles in CI step.

## T-I. Failure surfaces (expected)
- Too large payload: `apply_text_edits` with >15 KB aggregate → expect `{status:"too_large"}`.
- Stale file: change externally, then resend with old `precondition_sha256` → expect `{status:"stale_file"}` with hashes.
- Overlap: two overlapping ranges → expect rejection.
- Unbalanced braces: remove a closing `}` → expect validation failure and **no write**.
- Header guard: attempt insert before the first `using` → expect `{status:"header_guard"}`.
- Anchor aliasing: `insert`/`content` alias → expect success (aliased to `text`).
- Auto-upgrade: try a text edit overwriting a method header → prefer structured `replace_method` or return a clear error.
- **Pass** when each negative case returns the expected failure without persisting changes.

## T-J. Idempotency & no-op
- Re-run the same `replace_range` with identical content → expect success with no change.
- Re-run a delete of an already-removed helper via `regex_replace` → clean no-op.
- **Pass** if both behave idempotently.

### Implementation notes
- Always capture pre- and post‑windows (±20–40 lines) as evidence in the JUnit `<failure>` or as `<system-out>`.
- For any file write, include `precondition_sha256` and verify the post‑hash in your log.
- At the end, restore the repository to its original state (`git status` must be clean).

# Emit the JUnit file to reports/claude-nl-tests.xml and a summary markdown to reports/claude-nl-tests.md.
128 changes: 128 additions & 0 deletions .github/workflows/claude-nl-suite.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,128 @@
name: Claude NL suite + (optional) Unity compile

on:
workflow_dispatch: {}

permissions:
contents: write # allow Claude to write test artifacts
pull-requests: write # allow annotations / comments
issues: write

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
nl-suite:
if: github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
timeout-minutes: 60

steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0

# Python + uv for the Unity MCP server
- name: Install Python + uv
uses: astral-sh/setup-uv@v4
with:
python-version: '3.11'

- name: Install UnityMcpServer deps
run: |
set -eux
if [ -f "UnityMcpBridge/UnityMcpServer~/src/pyproject.toml" ]; then
uv pip install -e "UnityMcpBridge/UnityMcpServer~/src"
elif [ -f "UnityMcpBridge/UnityMcpServer~/src/requirements.txt" ]; then
uv pip install -r "UnityMcpBridge/UnityMcpServer~/src/requirements.txt"
else
echo "No Python deps found for UnityMcpServer~ (skipping)"
fi

- name: Run Claude NL/T test suite
id: claude
uses: anthropics/claude-code-base-action@beta
with:
# Test instructions live here
prompt_file: .claude/prompts/nl-unity-suite.md

# Tight tool allowlist
allowed_tools: "Bash(git:*),View,GlobTool,GrepTool,BatchTool,mcp__unity__*"

# MCP server path (matches your screenshots)
mcp_config: |
{
"mcpServers": {
"unity": {
"command": "python",
"args": ["UnityMcpBridge/UnityMcpServer~/src/server.py"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Path 'UnityMcpBridge/UnityMcpServer~/src/server.py' does not exist in the repository. The MCP server path needs to be corrected or the server files need to be added.

}
}
}

# Guardrails
model: "claude-3-7-sonnet-20250219"
max_turns: "10"
timeout_minutes: "20"
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}

- name: Upload JUnit (Claude NL/T)
if: always()
uses: actions/upload-artifact@v4
with:
name: claude-nl-tests
path: reports/claude-nl-tests.xml
if-no-files-found: ignore

- name: Annotate PR with test results (Claude NL/T)
if: always()
uses: dorny/test-reporter@v1
with:
name: Claude NL/T
path: reports/claude-nl-tests.xml
reporter: java-junit
fail-on-empty: false

# Detect secrets + project/package mode WITHOUT using secrets in `if:`
- name: Detect Unity mode & secrets
id: detect
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
run: |
if [ -n "$UNITY_LICENSE" ]; then echo "has_license=true" >> "$GITHUB_OUTPUT"; else echo "has_license=false" >> "$GITHUB_OUTPUT"; fi
if [ -f "ProjectSettings/ProjectVersion.txt" ]; then echo "is_project=true" >> "$GITHUB_OUTPUT"; else echo "is_project=false" >> "$GITHUB_OUTPUT"; fi
if [ -f "Packages/manifest.json" ] && [ ! -f "ProjectSettings/ProjectVersion.txt" ]; then echo "is_package=true" >> "$GITHUB_OUTPUT"; else echo "is_package=false" >> "$GITHUB_OUTPUT"; fi

# --- Optional: Unity compile after Claude’s edits (satisfies NL-4) ---
- name: Unity compile (Project)
if: always() && steps.detect.outputs.has_license == 'true' && steps.detect.outputs.is_project == 'true'
uses: game-ci/unity-test-runner@v4
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
with:
projectPath: .
githubToken: ${{ secrets.GITHUB_TOKEN }}
testMode: EditMode

- name: Unity compile (Package)
if: always() && steps.detect.outputs.has_license == 'true' && steps.detect.outputs.is_package == 'true'
uses: game-ci/unity-test-runner@v4
env:
UNITY_LICENSE: ${{ secrets.UNITY_LICENSE }}
UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
with:
packageMode: true
unityVersion: 2022.3.45f1 # set your exact version
projectPath: .
githubToken: ${{ secrets.GITHUB_TOKEN }}

- name: Clean working tree (discard temp edits)
if: always()
run: |
git restore -SW :/
git clean -fd
Loading