Add language-agnostic emitter-diff tool + python adapter#11122
Add language-agnostic emitter-diff tool + python adapter#11122l0lawrence wants to merge 27 commits into
Conversation
Ports the eng/emitter-diff tool: diffs generated code between two emitter versions (npm version, local folder, or github ref) and optionally runs the emitter's generated-code test suites. Ships the python adapter for @typespec/http-client-python, driving its native two-phase regenerate pipeline with a venv per emitter version. - eng/emitter-diff: @typespec/emitter-diff package (CLI, ref resolver, diff engine with HTML/VS Code/terminal output, adapter registry). - regenerate.ts: add --httpSpecsDir, --azureSpecsDir, --no-baseline flags. - pnpm-workspace.yaml: add eng/emitter-diff member + diff2html catalog entry. - ci-emitter-diff-python.yml: PR workflow that diffs head vs the PR base commit, uploads the rendered HTML artifact, and posts a sticky PR comment. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
CI now diffs the PR's emitter against the emitter at the commit recorded in eng/emitter-diff/baselines/python.sha and fails when generated output differs. To approve an intended change, update that SHA to a commit on the branch that contains the emitter changes and push; once the baseline matches head the diff is empty and the check passes. - cli.ts: --fail-on-diff now exits 2 for "diff present" (vs 1 for hard errors) so CI can distinguish an unapproved diff from a failure. - baselines/python.sha: the approved baseline commit. - ci-emitter-diff-python.yml: read the SHA file as baseline, run with --fail-on-diff, enforce the gate, and explain approval in the summary/comment. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Use a non-literal import specifier + local type so a typecheck that doesn't install this package's deps (e.g. the parent repo's check:eng, which includes core/eng) doesn't fail with TS2307; availability is validated at runtime. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
run_batch.py was always passed PLUGIN_DIR as --generated-dir, so when the emitter-diff tool redirects output via --generatedFolder the batch phase looked in the wrong tree, found no .tsp-codegen config files, and wrote zero .py files (leaving the path-bearing config files behind as spurious diffs). Pass the parent of GENERATED_FOLDER instead; it equals PLUGIN_DIR in the default case so normal regeneration is unchanged. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The two sides write to isolated output dirs, use separate venvs, and emit uniquely-named temp YAML, so they can regenerate concurrently. Run them with Promise.all and tag each side's streamed output with a [baseline]/[head] prefix so the interleaved logs stay attributable. Add --sequential to opt back into one-at-a-time generation (quieter logs, lower peak CPU/memory). Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Temporarily set the approved baseline to #10947 (timedelta duration encoding) so the diff PR comment and approval gate can be exercised end-to-end. Revert to the blessed HEAD SHA after the demo. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
commit: |
|
All changed packages have been documented.
Show changes
|
Python emitter diffBaseline Diff summary: 400 file(s), +400 / -0 Full rendered diff: emitter-diff-html artifact on the run. Informational check (eng/emitter-diff); does not block the PR. |
A non-zero exit that is not the dedicated 'diff present' code (2) means the tool failed to run (build/venv/generate threw), not that output is unchanged. Drive the comment off the diff step's exit status so a hard error shows a clear failure notice instead of a misleading 'no changes'. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Revert the approved baseline to the blessed HEAD SHA and add a one-line comment to the generated package __init__.py so the PR's generated output differs from the baseline, exercising the emitter-diff PR comment + approval gate end to end. Both changes are temporary and must be reverted before merge. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
|
You can try these changes here
|
pnpm's recursive 'pnpm --filter <pkg> exec' collapses any non-zero child exit into its own generic exit 1 (ERR_PNPM_RECURSIVE_EXEC_FIRST_FAIL), which masked the tool's dedicated 'diff present' code (2) as a hard error (1). Run the CLI from the package directory via plain (non-recursive) 'pnpm exec' so the real exit code propagates and the approval gate can distinguish a diff from a crash. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
…ocs) - resolver: on the full-fetch fallback, check out a github ref by name rather than FETCH_HEAD (which could be a branch head), so a SHA baseline/head never silently diffs the wrong commit. - resolver: gh:<ref> now resolves the origin repo from the local remote instead of hard-coding microsoft/typespec, so forks/renames work. - python adapter: recognize packages/http-specs and packages/azure-http-specs layouts for external --specs, and warn (instead of silently falling back to the current checkout) when an explicit --specs dir can't be resolved. - diff: --html always writes a file (a 'No differences' stub when unchanged) and inlines the diff2html stylesheet so the CI artifact renders offline. - cli: reject an invalid --test-target instead of silently running no suites. - docs: fix the broken github-sha example, note --specs doesn't accept npm refs, soften the 'no core changes' wording, and correct the stale workflow comment. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
- cli.ts: absolutize --work-dir (resolve, not bare join). The python adapter runs regenerate.ts with a different cwd, so a relative work-dir made --generatedFolder/--pluginDir resolve against the wrong directory: outputs landed elsewhere while the diff engine compared empty dirs -> silent false 'no differences'. Now every path handed to an adapter is absolute. - cli.ts: reject --specs npm: before building emitters (fast-fail). - python.ts: ensureDeps() runs 'npm ci' when node_modules is missing so a fresh gh:/github: clone builds instead of failing with 'cannot find module'. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The sticky comment now shows only the one-line summary (file count, +/-) and the emitter-diff-html artifact link, not an enumerated list of changed files. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
…ent) Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Security: - Pass the untrusted workflow_dispatch baseline input (and derived values) via env vars instead of interpolating GitHub Actions expressions into run: bash (Actions expression injection) - Validate git ref/repo and use "git checkout --end-of-options" (arg injection) - Add "npm install --ignore-scripts" when materializing a baseline package - Drop the remote CDN CSS fallback so the HTML report never fetches off-box Cleanup: - De-duplicate the two HTML report templates into one htmlDoc() helper - Remove emojis from CLI output, job summary, and PR comment Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The baseline is now the git merge-base with the PR base branch instead of a pinned SHA file, so it survives squash-merge / rebase / force-push. Diff approval is granted by adding the `emitter-diff-approved` label to the PR; labeled/unlabeled events re-run the check. Removes eng/emitter-diff/baselines/python.sha. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The workflow no longer fails on a generated-output diff or gates on a label. It runs the tool without --fail-on-diff, so a diff exits 0 and only a real tool/build error fails the job. Diffs are reported via job summary, a sticky PR comment, and the HTML artifact. Drops the label trigger, the approval gate, and the emitter-diff-approved label requirement. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
- Replace diff2html with a self-contained, GitHub-style HTML renderer (inline CSS, light/dark, no external requests). Runtime deps now empty. - Remove diff2html from package.json, pnpm catalog, and lockfile. - Drop the --run-tests/--test-env/--test-target surface and the adapter runTests contract; parallel generation and --sequential are unchanged. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Now that the tool has zero runtime dependencies, it no longer needs to be a workspace package. Deleting the package.json and its name removes any npm dependency-confusion surface, and the tool runs with plain ode (Node 24 strips TypeScript natively) instead of tsx. - Delete eng/emitter-diff/package.json; drop it from pnpm-workspace.yaml. - Rewrite relative imports to .ts extensions and set allowImportingTsExtensions so ode runs it and sc typechecks. - Invoke via ode src/cli.ts in CI, the shebang, and the README. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Running emitter-diff with plain node requires .ts import specifiers, but the repo-wide check:eng gate (tsconfig.eng.json, --noEmit) rejected them. Enable allowImportingTsExtensions there so the tool stays covered by the existing typecheck. Also reword the ensureDeps comment: http-client-python commits its own package-lock.json to the repo (lockfiles aren't in the published tarball), which is why npm ci works on a source checkout. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
A local run now prints the exact generated trees the diff compares, so developers can open the head (current checkout) output directly instead of inferring it from the work dir. Language-agnostic: lives in cli.ts, labeled per side by the resolved emitter. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
|
should emitter-diff live under eng/common bc we need this in tsp-azure too (like for rust/js/go etc) |
Cut the python workflow to ~half its size: - trim explanatory comment headers - drop the hand-rolled github-script patch parser; reuse the tool's own 'Diff summary:' line (ANSI-stripped) as a step output for the PR comment - drop the redundant job-summary step and the now-unused --patch output - collapse 'Fail on tool error' to a one-liner Behavior is unchanged: sticky PR comment, HTML artifact, informational-only (fails only on a tool/build error, never on a diff). Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
| - External `--specs` folders must mirror the `http-specs` / `azure-http-specs` layout. | ||
| - `--html` renders a self-contained, GitHub-style HTML report (inline CSS, no external requests). | ||
| The tool has **zero runtime dependencies** — the diff itself is `git diff --no-index`. | ||
| - **Python's native two-phase pipeline needs a venv per emitter version.** `regenerate.ts` |
There was a problem hiding this comment.
Maybe we could also use worktree for local diff and there are 2 pros:
- we don't need care about
venvsince worktree is in independent worktree - we could cache commit SHA for worktree then there is no need to regenerate for baseline after running once.
|
|
||
| ```bash | ||
| # Run directly with Node 24+ (native TypeScript, no build step, no dependencies): | ||
| node eng/emitter-diff/src/cli.ts --emitter python --baseline 0.34.0 |
There was a problem hiding this comment.
I think cli.ts shall have an option like --generated-code-path so that language emitter could set specific folder.
| ${pc.cyan("-j, --jobs <n>")} | ||
| Number of parallel compilation tasks (default: 30 on Linux/Mac, 10 on Windows). | ||
|
|
||
| ${pc.cyan("--httpSpecsDir <dir>")} |
There was a problem hiding this comment.
I expect there shall be no change for regenerate.ts.
The ideal usage is: all language emitter just need to add a new command like spector-code-diff: node eng/emitter-diff/src/cli.ts --command "npm run regenerate" --emitter python --emitter-path packages/http-client-python --generated-spector-code-path XXX --baseline github:microsoft/typespec@ ...
In cli.ts, it will step into emitter path and run command to regenerate then extract from generated code of --generated-spector-code-path.
In one word, main work shall be in emitter-diff script and language emitters just need to add a new command to call the emitter-diff script to provide necessary command.
| | Option | Description | | ||
| | ------------------ | ------------------------------------------------------------------------------------- | | ||
| | `--name <pattern>` | Filter which specs/packages are generated | | ||
| | `--html <file>` | Write the rendered HTML report to this path (default: `<work-dir>/emitter-diff.html`) | | ||
| | `--terminal` | Print the full colored patch to the terminal instead | | ||
| | `--patch <file>` | Write the raw unified diff to a file | | ||
| | `--fail-on-diff` | Exit non-zero when output differs (CI gating) | | ||
| | `--opt key=value` | Repeatable adapter-specific option (e.g. `--opt flavor=azure`) | | ||
| | `--sequential` | Generate baseline then head one at a time (default: both in parallel) | | ||
| | `-- <args>` | Everything after `--` is forwarded to the adapter | |
There was a problem hiding this comment.
At first version, I think there is no need to add too many customized options. Just call the regenerate command of language emitters then show the diff. And we could add more command option in the future if needed.
…t diff configs can run script directly in terminal) -- need to check on cache implementation locally
…r fix Spec versions rarely change, so pinning specs across baseline/head is unnecessary. Revert resolveSpecsDir + spec existence warnings back to main (hardcoded PLUGIN_DIR node_modules specs). Keep the batch-Python change so run_batch.py honors --generatedFolder, which the emitter-diff tool relies on to isolate baseline/head output trees. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
- resolver.ts: attach `cause` to re-thrown manifest parse error (oxlint preserve-caught-error) - README: replace nonsense `libba` placeholder with a real example - cspell: allow `Menlo` (font name in the HTML report CSS) - add chronus changeset for @typespec/http-client-python Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
What
A language-agnostic tool,
eng/emitter-diff(@typespec/emitter-diff), that diffs the generated code produced by two versions of a TypeSpec emitter and optionally runs the emitter's generated-code test suites. It resolves a baseline emitter from an npm version, a local folder, or a GitHub url/sha, generates with both baseline and head, and renders the diff as a clickable HTML report (default), a VS Code diff, or a terminal patch.Ships the python adapter for
@typespec/http-client-python, which drives the package's existingregenerate.tstwo-phase native pipeline (TypeSpec → YAML, then a batched Python subprocess), building and creating a venv per emitter version.Changes
eng/emitter-diff/— the tool: CLI/orchestrator, ref resolver (npm/local/github), genericEmitterAdapterinterface + registry, diff engine (HTML/VS Code/terminal), python adapter. Baseline and head generate in parallel by default (prefixed logs;--sequentialto opt out).packages/http-client-python/eng/scripts/ci/regenerate.ts— adds--httpSpecsDir,--azureSpecsDir,--no-baselineso the tool can pin specs and skip the published-baseline clone; and makes the batched Python phase honor a custom--generatedFolder. Defaults preserve current behavior.pnpm-workspace.yaml— addseng/emitter-diffas a workspace member anddiff2htmlto the catalog (lockfile updated)..github/workflows/ci-emitter-diff-python.yml— PR workflow with an approved-baseline gate.Approved-baseline gate
The approved generated-output baseline is a commit SHA in
eng/emitter-diff/baselines/python.sha. CI builds the emitter for both the PR checkout and a worktree of that commit, diffs them, uploads the HTML artifact, and posts a sticky PR comment. If the output differs, the run exits2and the job fails. To approve an intended change, update the SHA to a commit on your branch containing your emitter changes and push — once the baseline matches head, the diff is empty and the check passes.Notes / open questions
npm run installs each side.--use-pyodidefrom the azure variant is dropped (no pyodide path in core).Azure/typespec-azure(Support generating Python SDK with Pyodide #4784).Supersedes the fork-based draft #11119 (closed); moved here so the CI PR-comment + gate run with a read-write token.