
fix: add Windows .bat + expand .gitattributes to handle CRLF in *.sh and *.env #6

Open
sami-marreed wants to merge 5 commits into main from fix/crlf-line-endings

sami-marreed (Contributor) commented May 26, 2026

Summary

  • Adds .gitattributes pinning *.sh, *.bash, *.env, *.yaml, *.yml, *.toml to LF (only *.sh/*.bash were covered in the first commit, so appworld.env was still checked out as CRLF on Windows).
  • Pins *.bat/*.cmd/*.ps1 to CRLF (cmd.exe expects CRLF in batch files).
  • Adds fix_line_endings.bat at the repo root: a Windows batch + PowerShell script that walks the repo and strips CRLF from *.sh and *.env files (skipping .git, .venv, vendor, node_modules). This recovers existing Windows clones without forcing a re-clone.
  • README: short Windows/WSL section pointing users at the .bat.
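
To make the recovery step concrete, here is a minimal Python sketch of the same idea: walk the tree, skip the excluded directories, and rewrite CRLF as LF in *.sh and *.env files. It is an illustration only, not the contents of fix_line_endings.bat (which is batch + PowerShell); the function name `strip_crlf` and the byte-level replacement are assumptions.

```python
import os
from pathlib import Path

# Directories the PR description says are skipped.
SKIP_DIRS = {".git", ".venv", "vendor", "node_modules"}
# File name endings the PR description says are normalized.
TARGET_SUFFIXES = (".sh", ".env")


def strip_crlf(root):
    """Rewrite CRLF as LF in matching files under root; return changed paths."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if not name.endswith(TARGET_SUFFIXES):
                continue
            path = Path(dirpath) / name
            data = path.read_bytes()
            fixed = data.replace(b"\r\n", b"\n")
            if fixed != data:
                path.write_bytes(fixed)
                changed.append(path)
    return changed
```

Working on bytes rather than decoded text avoids guessing each file's encoding, and files that are already LF-only are left unwritten.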

Note: the original description cited Closes #74 and Closes #88, which refer to internal-tracker issues that were not migrated to this repo. There is no matching open issue on this repo for the Windows/CRLF problem; references removed.

Why not a bash self-heal prelude?

I tried this first and it doesn't work: bash cannot parse a script whose own keywords end in \r ("fi\r" is not "fi"), so an if ... fi self-heal block fails with a parse error before it can run. Even a single-line prelude fails. The fix has to come from a tool that doesn't choke on CRLF in its own source, hence the .bat.
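
The parse failure can be reproduced with a short Python sketch that asks bash to syntax-check the same tiny script twice, once with LF and once with CRLF line endings. The script body and the helper name `bash_can_parse` are made up for illustration, and the sketch assumes a `bash` executable is on PATH (it returns None otherwise).

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A tiny "self-heal"-shaped script; the body is irrelevant because bash
# never gets past parsing when the line endings are CRLF.
SCRIPT_LF = b"if true; then\n  echo healed\nfi\n"
SCRIPT_CRLF = SCRIPT_LF.replace(b"\n", b"\r\n")


def bash_can_parse(script_bytes):
    """Return True/False from `bash -n`, or None if bash is not installed."""
    bash = shutil.which("bash")
    if bash is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "prelude.sh"
        path.write_bytes(script_bytes)
        # -n asks bash to parse the script without executing it.
        result = subprocess.run([bash, "-n", str(path)], capture_output=True)
    return result.returncode == 0
```

With CRLF endings bash reads the words "then\r" and "fi\r" as ordinary command names, so it reaches end of file still waiting for a real `then`, and the check fails.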

Migration

| Situation | What to do |
| --- | --- |
| Fresh clone (any OS) after this merges | Nothing; .gitattributes ensures LF on checkout. |
| Existing Windows clone with CRLF on disk | Run fix_line_endings.bat once (double-click or from cmd/PowerShell), then run setup_*.sh under WSL. |
| Existing Windows clone, no local edits | Re-clone is also fine. |

Test plan

  • Verify fix_line_endings.bat present after merge (master)
  • On a Windows machine with CRLF setup_cuga.sh/appworld.env: double-click fix_line_endings.bat, then run setup_cuga.sh under WSL — confirm both succeed
  • Fresh clone on Windows after merge: confirm setup_cuga.sh and benchmarks/appworld/config/appworld.env already have LF (no .bat run needed)
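
For the "already have LF" checks above, a small helper along these lines could be used instead of inspecting files by eye. This is a sketch, not part of the PR; the name `has_crlf` is hypothetical.

```python
from pathlib import Path


def has_crlf(path):
    """Return True if the file at path contains at least one CRLF pair."""
    # Read bytes so that no newline translation hides the carriage returns.
    return b"\r\n" in Path(path).read_bytes()
```

Run from the repo root on a fresh clone, `has_crlf("setup_cuga.sh")` and `has_crlf("benchmarks/appworld/config/appworld.env")` would both be expected to return False.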

haroldship and others added 5 commits May 26, 2026 12:33
Prevents Windows clones (with core.autocrlf=true) from checking out
*.sh files with CRLF, which breaks bash under WSL.

Closes #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Expands the CRLF fix beyond the .gitattributes pin so existing Windows
clones can recover without re-cloning:

* .gitattributes now also pins *.env, *.yaml, *.yml, *.toml to LF
  (only *.sh and *.bash were covered before, so *.env files like
  appworld.env were still checked out as CRLF on Windows).
* Adds fix_line_endings.bat at repo root: a self-contained Windows
  batch + PowerShell script that walks the repo and strips CRLF from
  *.sh and *.env files, skipping .git/.venv/vendor/node_modules.
* Pins *.bat/*.cmd/*.ps1 to CRLF (cmd.exe expects CRLF in batch files).
* README: short Windows/WSL note pointing users at the .bat.

A bash self-heal prelude was attempted but is fundamentally unworkable:
bash cannot parse a script whose own keywords end in \r ("fi\r" != "fi"),
so the heal code never gets a chance to run.

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The exclusion regex used \\(...)\\ which only matches Windows
backslash separators. Verified with pwsh on macOS: vendor/
and node_modules/ paths were silently being normalized
because the regex never matched their forward-slash paths.

Switch to [\\/](...)[\\/] so exclusions work whether the script
is invoked under Windows cmd.exe (backslash paths) or pwsh on
any OS (which is how this was discovered).

Tested via pwsh on macOS against a fixture with .git/, .venv/,
vendor/, node_modules/ paths -- all correctly skipped now.
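
The difference between the two patterns can be seen in a short Python sketch. The alternation inside the group is an assumption (the message above abbreviates it as "(...)"); it uses the four directory names listed in the test fixture. Python's `re` and the .NET regex engine used by PowerShell agree on these simple constructs.

```python
import re

# Assumed alternation: the commit message elides the group as "(...)";
# these are the four directories it says are skipped.
DIRS = r"(\.git|\.venv|vendor|node_modules)"

# Old pattern: a literal backslash is required on both sides.
BACKSLASH_ONLY = re.compile(r"\\" + DIRS + r"\\")
# New pattern: either a backslash or a forward slash on both sides.
EITHER_SEPARATOR = re.compile(r"[\\/]" + DIRS + r"[\\/]")


def is_excluded(path, pattern):
    """Return True if the path contains an excluded directory component."""
    return pattern.search(path) is not None


windows_path = r"C:\repo\vendor\lib\run.sh"
posix_path = "/repo/vendor/lib/run.sh"
```

Here `is_excluded(posix_path, BACKSLASH_ONLY)` is False, which is the silent miss described above, while `EITHER_SEPARATOR` excludes both paths.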

Refs #74

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Every .sh in the repo now has a sibling .bat (35 scripts + 1 shared
_delegate_to_bash.bat helper) so Windows users have a path that doesn't
require Git Bash for the simple cases.

Two translation styles:

- Pure cmd.exe ports (15 files): the simple wrappers and env-loaders
  — setup_cuga, setup_m3, setup_appworld, load_env, run_registry
  (helper + 4 per-benchmark stubs), model_profiles, viz, the analyze
  thin-stubs, all run_app/run_eval wrappers. These are usable on a
  vanilla Windows install.

- bash-delegate shims (19 files): for the heavy scripts that use
  POSIX-only features (lsof, pkill, signal traps, process substitution,
  sourceable function libraries, embedded Python heredocs, mktemp,
  comm, find -mindepth/-maxdepth). Each shim is ~6 lines and calls
  _delegate_to_bash.bat, which tries Git Bash (well-known install
  paths) -> bash on PATH -> WSL -> friendly install instructions.
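
The fallback order used by the shims can be sketched in Python as follows. This is not the contents of _delegate_to_bash.bat: the two Git Bash paths, the `wsl bash` command form, and the name `resolve_bash` are all assumptions made for illustration, since the message only says "well-known install paths".

```python
import os
import shutil

# Hypothetical install locations; the commit message only says
# "well-known install paths" without listing them.
GIT_BASH_CANDIDATES = [
    r"C:\Program Files\Git\bin\bash.exe",
    r"C:\Program Files (x86)\Git\bin\bash.exe",
]


def resolve_bash(candidates=GIT_BASH_CANDIDATES,
                 exists=os.path.isfile, which=shutil.which):
    """Return (kind, command) for the first available bash, or (None, None)."""
    # 1. Git Bash at a known install path.
    for path in candidates:
        if exists(path):
            return ("git-bash", [path])
    # 2. Any bash found on PATH.
    on_path = which("bash")
    if on_path:
        return ("path", [on_path])
    # 3. WSL, asked to run bash inside the default distribution.
    wsl = which("wsl")
    if wsl:
        return ("wsl", [wsl, "bash"])
    # 4. Nothing found: the caller should print install instructions.
    return (None, None)
```

The `exists` and `which` parameters are injectable only so the order can be exercised without a Windows machine.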

benchmarks/helpers/common.bat is a placeholder noting that the bash
function library it mirrors can't be sourced into cmd.exe; callers
that delegate to bash source the .sh version directly.

.secrets.baseline gets one new entry for the openai/gpt-oss-120b model
name in scripts/model_profiles.bat (false positive — same string is
already in scripts/model_profiles.sh; cmd.exe's lack of inline
comment syntax means the standard pragma can't be embedded on the
line, so the baseline is the cleaner workaround).

The longer-term cleanup — move logic into Python so .sh and .bat both
become ~5-line wrappers around `uv run python -m ...` — is tracked
in research-rpa/cuga-internal-evaluation#88.

README changes:
- Expand the Windows note in §Installation to mention that every .sh
  has a .bat sibling, and that the simple wrappers run on stock cmd.exe
  (no WSL/Git Bash needed) while the heavy scripts delegate to bash.
- New §Running on Windows in Quick Start with cmd.exe/PowerShell
  examples mirroring the bash examples above it.
- Reference the smoke test and the long-term Python migration (#88).

scripts/test_bat_scripts.ps1 — new pwsh 7 smoke test that verifies
structural invariants of the .bat layer:
  1. every in-scope .sh has a sibling .bat
  2. every .bat starts with `@echo off`
  3. every .bat has an `exit /b` terminator
  4. every delegate shim points to an existing .sh
  5. every shim's `_delegate_to_bash.bat` exists where expected
It doesn't execute the .bat files (cmd.exe isn't available on
mac/linux), but it caught a real bug on its first run — three orphan
`analyze.bat` files delegating to .sh files that don't exist on this
branch (they were added on master after this branch was cut). Test
runs on any host with pwsh:  `pwsh scripts/test_bat_scripts.ps1`.
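
As an illustration of what invariants 1 to 3 amount to, here is a minimal Python sketch of the same structural checks. It is not a port of scripts/test_bat_scripts.ps1: the name `check_bat_layer` is hypothetical, and it assumes every .sh under the given root is in scope, whereas the real test applies its own scope rules.

```python
from pathlib import Path


def check_bat_layer(root):
    """Return a list of human-readable violations of invariants 1-3."""
    root = Path(root)
    problems = []
    # Invariant 1: every .sh has a sibling .bat.
    for sh in sorted(root.rglob("*.sh")):
        if not sh.with_suffix(".bat").exists():
            problems.append(f"{sh.name}: missing sibling .bat")
    for bat in sorted(root.rglob("*.bat")):
        lines = bat.read_text(encoding="utf-8", errors="replace").splitlines()
        # Invariant 2: the first line is `@echo off` (compared case-insensitively).
        if not lines or lines[0].strip().lower() != "@echo off":
            problems.append(f"{bat.name}: does not start with @echo off")
        # Invariant 3: at least one line contains `exit /b`.
        if not any("exit /b" in line.lower() for line in lines):
            problems.append(f"{bat.name}: no exit /b terminator")
    return problems
```

Like the PowerShell test, this only reads the files and never executes them, so it can run on any host.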

Orphan deletions: benchmarks/appworld/analyze.bat,
benchmarks/m3/analyze.bat, scripts/analyze.bat. Their .sh counterparts
will arrive when master is merged in; the .bat files can come back at
that point.

.secrets.baseline auto-updated by the detect-secrets hook to refresh
the timestamp after the line-number shift in scripts/model_profiles.bat.