CI: add Claude NL suite workflow, prompt, and long Unity test script … #30

dsarno · 2025-08-19T21:04:57Z

…on default branch for manual runs. This will ultimately become the suite of Claude Desktop test prompts that talk to the Unity MCP and to headless Unity in order to validate MCP functionality.

Summary by CodeRabbit

New Features
- Added a comprehensive Unity test script to simulate complex interaction and animator behaviors for validation.
Documentation
- Added a guide describing the NL/T editing test suite, phases, tools, outputs, safety, and acceptance criteria.
Tests
- Introduced a natural‑language editing test suite that performs structured edits, validates results, and generates JUnit and Markdown reports.
CI
- Added a manually triggered workflow to run the suite, upload reports, and annotate PRs. Detects Unity projects/packages and optionally compiles with Unity. Ensures environment cleanup after runs.
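As a rough illustration of the "JUnit and Markdown reports" mentioned above, the following Python sketch builds both outputs from a list of test results. The function names, the suite name, and the result tuple shape are assumptions made for this example; the PR's actual report format is defined by the prompt file and is not reproduced here.

```python
import xml.etree.ElementTree as ET


def build_junit(results, suite_name="UnityMCP.NL-T"):
    """Build a minimal JUnit XML string.

    `results` is a list of (test_name, passed, message) tuples.
    The suite name is a hypothetical placeholder.
    """
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for name, passed, message in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name, name=name)
        if not passed:
            # A failed case carries a <failure> child with the reason.
            failure = ET.SubElement(case, "failure", message=message)
            failure.text = message
    return ET.tostring(suite, encoding="unicode")


def build_markdown(results):
    """Build a small Markdown table summarizing the same results."""
    lines = ["| Test | Result |", "| --- | --- |"]
    for name, passed, _ in results:
        lines.append(f"| {name} | {'pass' if passed else 'fail'} |")
    return "\n".join(lines)
```

A reporter that expects the `java-junit` format reads the `testsuite`/`testcase`/`failure` elements, which is why the sketch sticks to that minimal shape.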

coderabbitai · 2025-08-19T21:05:05Z

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a new Claude NL/T test prompt, a GitHub Actions workflow to run the suite and optionally compile Unity projects/packages, and a large standalone Unity MonoBehaviour script used as the primary NL test target.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| NL/T Prompt & Spec `.claude/prompts/nl-unity-suite.md` | Introduces a detailed prompt/spec for running NL/T editing tests via MCP, defining tools, phases, validations, outputs (JUnit XML, Markdown), safety constraints, and implementation notes. |
| CI Workflow: NL Suite + Optional Unity Compile `.github/workflows/claude-nl-suite.yml` | Adds a workflow (manual trigger) to run Claude NL/T tests via claude-code-base-action with Unity MCP server, upload JUnit, annotate PRs, detect Unity mode, optionally run Unity test runner (project/package), and clean the repo. |
| Unity Test Target Script `ClaudeTests/longUnityScript-claudeTest.cs` | Adds LongUnityScriptClaudeTest MonoBehaviour with target selection/switching, animator parameter updates, held-object tracking, blend computation, and extensive padded methods for length. Public fields and methods exposed for tests. |
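The "detect Unity mode" step in the workflow summary could look roughly like the Python sketch below. It is a guess at the logic, not the workflow's actual implementation: the function name `detect_unity_mode`, the `"skip"` return value, and the choice of marker files (`ProjectSettings/ProjectVersion.txt` for a Unity project, a root-level `package.json` for a package) are assumptions for illustration.

```python
from pathlib import Path


def detect_unity_mode(repo_root, has_license):
    """Return "project", "package", or "skip" (hypothetical mode names).

    Assumes a Unity project is marked by ProjectSettings/ProjectVersion.txt
    and a Unity package by a package.json at the given root.
    """
    if not has_license:
        # Without a Unity license the compile step cannot run at all.
        return "skip"
    root = Path(repo_root)
    if (root / "ProjectSettings" / "ProjectVersion.txt").is_file():
        return "project"
    if (root / "package.json").is_file():
        return "package"
    return "skip"
```

Checking for the project marker first means a repository containing both a project and a package manifest is treated as a project, which is one possible tie-break among several.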

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant GH as GitHub Actions
  participant Repo as Repository
  participant MCP as Unity MCP Server
  participant Claude as Claude Code Base Action
  participant Reporter as Test Reporter
  participant Unity as Unity Test Runner

  GH->>Repo: checkout (fetch-depth: 0)
  GH->>GH: setup Python/uv
  GH->>MCP: install deps (pyproject/requirements)
  GH->>Claude: run NL/T suite (prompt_file, tools, model, max_turns)
  Claude->>MCP: structured edit ops (replace/insert/delete/anchor/edits)
  MCP-->>Claude: results + evidence
  Claude->>Repo: write reports (JUnit XML, Markdown)
  GH->>GH: upload artifact (JUnit)
  GH->>Reporter: annotate PR (java-junit)
  GH->>GH: detect Unity mode (license, project/package)
  alt Project mode
    GH->>Unity: run EditMode tests (projectPath .)
  else Package mode
    GH->>Unity: run package compile (packageMode true)
  end
  GH->>Repo: clean working tree (restore + clean)

sequenceDiagram
  autonumber
  participant UnityLoop as Unity Engine
  participant Script as LongUnityScriptClaudeTest
  participant Anim as Animator

  UnityLoop->>Script: Update()
  Script->>Script: FindBestTarget()
  Script->>Script: HandleTargetSwitch(prev/current, timestamps)
  Script->>Script: TickBlendOnce()
  Script->>Script: AccumulateBlend()
  Script->>Anim: ApplyBlend(reachX, reachY)

  UnityLoop->>Script: LateUpdate()
  Script->>Script: Decay previousTarget if elapsed > 0.5s

  Note over Script,Anim: OnObjectHeld/OnObjectPlaced adjust held list and Anim "objectsHeld"
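To make the Update/LateUpdate flow in the diagram above easier to follow, here is a small Python model of the target-switch and decay behavior. It is not the C# script from the PR: the class name `TargetTracker`, its field names, and the reading of "decay" as "clear the previous target" are assumptions. Only the 0.5 s threshold comes from the diagram.

```python
class TargetTracker:
    """Toy model of target switching with a timed decay (assumed semantics)."""

    DECAY_SECONDS = 0.5  # threshold taken from the diagram's LateUpdate step

    def __init__(self):
        self.current_target = None
        self.previous_target = None
        self.last_switch_time = 0.0

    def update(self, best_target, now):
        """Mirror of Update(): record a switch when the best target changes."""
        if best_target != self.current_target:
            self.previous_target = self.current_target
            self.current_target = best_target
            self.last_switch_time = now

    def late_update(self, now):
        """Mirror of LateUpdate(): drop the previous target once it is stale."""
        elapsed = now - self.last_switch_time
        if self.previous_target is not None and elapsed > self.DECAY_SECONDS:
            self.previous_target = None
```

In this model the previous target survives for at most half a second after a switch, which would give a blend step time to interpolate between the two targets before the old one is forgotten.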

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I twitch my ears at CI’s bright light,
Hop through prompts and tests at night.
A Unity script, long as clover rows,
Targets found, animator knows.
Logs like carrots neatly stacked—
JUnit trails mark every act.
Thump! The suite returns, all packed.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14a6cba and 88a2de2.

📒 Files selected for processing (3)

.claude/prompts/nl-unity-suite.md (1 hunks)
.github/workflows/claude-nl-suite.yml (1 hunks)
ClaudeTests/longUnityScript-claudeTest.cs (1 hunks)

greptile-apps

Greptile Summary

This PR introduces a comprehensive automated testing framework for validating the Unity MCP's natural language interface capabilities with Claude Desktop. The changes add three key components that work together to create a robust CI testing environment:

Core Components Added:

- Claude NL Test Suite (`.claude/prompts/nl-unity-suite.md`): A detailed specification document defining 20+ test scenarios that validate MCP editing functionality through natural language commands. These tests cover everything from basic code modifications to complex error handling scenarios, ensuring the Unity-Claude integration works reliably across various use cases.
- GitHub Actions Workflow (`.github/workflows/claude-nl-suite.yml`): A sophisticated CI pipeline that orchestrates the entire testing process. The workflow sets up Python dependencies for the MCP server, executes Claude NL/T tests using Anthropic's code generation actions, and optionally compiles Unity projects to validate generated code. It includes intelligent project structure detection and proper cleanup mechanisms.
- Large Unity Test Script (`ClaudeTests/longUnityScript-claudeTest.cs`): A deliberately verbose 2000+ line Unity MonoBehaviour script designed to test Claude's ability to navigate and edit large codebases. The script includes 600+ padding methods to simulate real-world complexity while maintaining a functional core with target tracking and object management systems.

Integration with Existing Codebase:
This testing framework builds upon the existing Unity MCP bridge architecture found in the UnityMcpBridge directory. The workflow is designed to work with the MCP server components and validates the communication layer between Claude Desktop and Unity Editor tools. The manual trigger mechanism (workflow_dispatch) aligns with the project's approach to controlled, expensive AI-powered testing that complements the existing automated Unity tests in .github/workflows/unity-tests.yml.

The test suite follows established patterns from the existing codebase, using proper Unity conventions and maintaining the same structure as other test assets in the TestProjects directory.

Important Files Changed

Files Modified

| Filename | Score | Overview |
| --- | --- | --- |
| `.github/workflows/claude-nl-suite.yml` | 2/5 | Adds CI workflow for Claude NL testing but has critical path configuration issues |
| `.claude/prompts/nl-unity-suite.md` | 4/5 | Comprehensive test specification for Claude-Unity MCP validation with proper safety measures |
| `ClaudeTests/longUnityScript-claudeTest.cs` | 5/5 | Large Unity test script with intentional verbosity for AI editing validation |

Confidence score: 2/5

- This PR requires careful review due to critical configuration issues that will prevent the workflow from executing successfully.
- Score lowered due to incorrect MCP server path references and potential missing dependencies that could cause immediate CI failures.
- Pay close attention to `.github/workflows/claude-nl-suite.yml`, which references non-existent server paths and may have dependency resolution issues.

Sequence Diagram

sequenceDiagram
    participant User
    participant "GitHub Actions" as GHA
    participant "Claude Code Base Action" as Claude
    participant "Unity MCP Server" as MCP
    participant "Unity Test File" as TestFile
    participant "Test Reporter" as Reporter

    User->>GHA: "Trigger workflow_dispatch"
    GHA->>GHA: "Checkout repository"
    GHA->>GHA: "Install Python + uv"
    GHA->>GHA: "Install UnityMcpServer dependencies"
    
    GHA->>Claude: "Run Claude NL/T test suite"
    Note over Claude: "Load prompt from .claude/prompts/nl-unity-suite.md"
    
    Claude->>MCP: "Start Unity MCP Server"
    Note over MCP: "python UnityMcpBridge/UnityMcpServer~/src/server.py"
    
    loop "For each test case (NL-0 through T-J)"
        Claude->>TestFile: "Read ClaudeTests/longUnityScript-claudeTest.cs"
        Claude->>MCP: "mcp__unity__* tool calls"
        Note over Claude,MCP: "replace_method, insert_method, delete_method, etc."
        MCP->>TestFile: "Apply structured edits"
        Claude->>TestFile: "Verify changes via windowed reads"
        Claude->>TestFile: "Revert changes for cleanup"
    end
    
    Claude->>Claude: "Generate JUnit XML report"
    Claude->>Claude: "Generate markdown summary"
    Claude-->>GHA: "Return test results"
    
    GHA->>GHA: "Upload JUnit artifacts"
    GHA->>Reporter: "Annotate PR with test results"
    
    alt "If Unity license available"
        GHA->>GHA: "Run Unity compile tests"
        Note over GHA: "Validate that edits don't break compilation"
    end
    
    GHA->>GHA: "Clean working tree (discard temp edits)"
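The per-test loop in the diagram above (apply an edit, verify it with a windowed read, then revert) can be sketched in plain Python. This stand-in works on an in-memory list of lines and does not call the `mcp__unity__*` tools; the helper names `windowed_read` and `run_edit_case` are hypothetical and exist only for this example.

```python
def windowed_read(lines, start, count):
    """Return `count` lines beginning at 0-based index `start`."""
    return lines[start:start + count]


def run_edit_case(original_lines, anchor, new_line):
    """Insert `new_line` after the first line containing `anchor`,
    verify the edit through a small read window, then revert it.

    Returns (passed, lines_after_revert).
    """
    idx = next((i for i, line in enumerate(original_lines) if anchor in line), None)
    if idx is None:
        # Anchor not found: the case fails and the file is left untouched.
        return False, original_lines

    # Apply the edit.
    edited = original_lines[:idx + 1] + [new_line] + original_lines[idx + 1:]

    # Verify by reading only the two lines around the edit, not the whole file.
    window = windowed_read(edited, idx, 2)
    verified = window == [original_lines[idx], new_line]

    # Revert so the next test case starts from the original content.
    reverted = edited[:idx + 1] + edited[idx + 2:]
    return verified and reverted == original_lines, reverted
```

Reading a small window instead of the full file matters for a target of 2000+ lines, since each verification then costs a handful of lines rather than the whole script.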

3 files reviewed, 1 comment

greptile-apps · 2025-08-19T21:05:59Z

.github/workflows/claude-nl-suite.yml

+              "mcpServers": {
+                "unity": {
+                  "command": "python",
+                  "args": ["UnityMcpBridge/UnityMcpServer~/src/server.py"]


logic: Path 'UnityMcpBridge/UnityMcpServer~/src/server.py' does not exist in the repository. The MCP server path needs to be corrected or the server files need to be added.
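A missing-path problem like the one flagged above can be caught before the workflow runs with a small pre-flight check. The Python sketch below parses an `mcpServers` configuration of the shape quoted in the diff and reports script arguments that do not resolve to files. The function name `missing_server_paths` and the rule of treating any argument ending in `.py` as a path are assumptions for this example.

```python
import json
from pathlib import Path


def missing_server_paths(config_text, repo_root="."):
    """Return (server_name, path) pairs whose script file is absent.

    Assumes the config has the {"mcpServers": {name: {"args": [...]}}} shape
    and that any argument ending in ".py" is a repo-relative script path.
    """
    config = json.loads(config_text)
    missing = []
    for name, server in config.get("mcpServers", {}).items():
        for arg in server.get("args", []):
            if arg.endswith(".py") and not (Path(repo_root) / arg).is_file():
                missing.append((name, arg))
    return missing
```

Run from the repository root in an early CI step, a non-empty result would let the job fail fast with a clear message instead of failing later when the server cannot start.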

CI: add Claude NL suite workflow, prompt, and long Unity test script …

88a2de2

…on default branch for manual runs

greptile-apps bot reviewed Aug 19, 2025

dsarno merged commit 3f9f9a3 into main Aug 19, 2025
1 of 2 checks passed

coderabbitai bot mentioned this pull request Aug 20, 2025

Feat/nl edits ai planner prep #26

Closed

dsarno deleted the chore/ci-claude-nl-workflow branch August 20, 2025 16:21

This was referenced Aug 29, 2025

Framed transport + Claude‑friendly edit tools + live Unity NL test framework #42 #56

Closed

Sync upstream main 20250830 132219 #57

Closed

Fix/brace validation improvements #60

Closed

coderabbitai bot mentioned this pull request Sep 6, 2025

Nl ci workflow fixes #62

Closed

coderabbitai bot mentioned this pull request Oct 31, 2025

Server: add robust shutdown on stdio detach (signal handlers, stdin/p… #91

Closed

coderabbitai bot mentioned this pull request Dec 10, 2025

Unity tests fork backup #103

Closed

This was referenced Jan 5, 2026

🎮 GameObject Toolset Redesign and Streamlining #123

Closed

🔧 Consolidate Shared Services Across MCP Tools #125

Closed
