Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
b741d3b
Add language-agnostic emitter-diff tool + python adapter
l0lawrence Jun 30, 2026
dadc841
Add approved-baseline gate to emitter-diff CI
l0lawrence Jun 30, 2026
244f134
Add gpgsign to cspell word list
l0lawrence Jun 30, 2026
b996a66
Make diff2html import resilient to aggregate typecheck
l0lawrence Jun 30, 2026
c7d896f
Fix batch Python phase to honor --generatedFolder override
l0lawrence Jun 30, 2026
1b64aa7
Generate baseline and head in parallel by default
l0lawrence Jun 30, 2026
ce0ab95
TEMP: point emitter-diff baseline at older commit for demo
l0lawrence Jun 30, 2026
4fca715
Report emitter-diff hard errors distinctly in the PR comment
l0lawrence Jun 30, 2026
943d24a
DEMO: sample emitter output change to exercise the diff gate
l0lawrence Jun 30, 2026
2ab0428
Preserve the emitter-diff exit code in CI (avoid pnpm code masking)
l0lawrence Jun 30, 2026
4d2044d
Harden emitter-diff secondary paths (refs, specs, HTML, validation, d…
l0lawrence Jul 1, 2026
174b1cb
Fix emitter-diff path/dep handling for external baselines
l0lawrence Jul 1, 2026
a233572
Drop per-file list from emitter-diff PR comment; link the HTML instead
l0lawrence Jul 1, 2026
17c65ee
Remove --vscode diff option from emitter-diff (HTML report is suffici…
l0lawrence Jul 1, 2026
7e27c74
Harden emitter-diff: fix CI injection, validate git refs, drop emojis
l0lawrence Jul 1, 2026
dcd4ca9
Replace committed baseline SHA with merge-base + PR-label approval
l0lawrence Jul 1, 2026
02a7cea
Make emitter-diff check informational (pass unless real error)
l0lawrence Jul 1, 2026
a1c0118
Simplify emitter-diff: zero-dep HTML report, drop run-tests
l0lawrence Jul 1, 2026
722e756
Run emitter-diff as plain node scripts, not a package
l0lawrence Jul 1, 2026
c2e2e99
Allow .ts extension imports in eng typecheck
l0lawrence Jul 1, 2026
abe62b0
Log baseline/head output dirs after generation
l0lawrence Jul 1, 2026
4601b99
Slim down emitter-diff CI workflow
l0lawrence Jul 2, 2026
02e6400
start addressing comments (now a default script for python but if wan…
l0lawrence Jul 2, 2026
88fff4f
Revert spec-dir resolution in regenerate.ts; keep only generatedFolde…
l0lawrence Jul 2, 2026
05e1bea
keep path diff for now
l0lawrence Jul 2, 2026
a56d50c
Merge remote-tracking branch 'origin/main' into l0lawrence-emitter-diff
l0lawrence Jul 2, 2026
9925d65
Fix emitter-diff CI: lint cause, spelling, changeset
l0lawrence Jul 3, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .chronus/changes/emitter-diff-tool-2026-7-2-0-0-0.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
changeKind: internal
packages:
- "@typespec/http-client-python"
---

Add a `diff-spector-tests` script wired to the new language-agnostic emitter-diff tool, and honor `--generatedFolder` in the batch Python phase of `regenerate.ts`
122 changes: 122 additions & 0 deletions .github/workflows/ci-emitter-diff-python.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
name: "python / emitter diff"

# Diffs generated code between this PR's emitter and the merge-base baseline.
# Informational only: reports a sticky PR comment + HTML artifact, and fails the
# job only on a tool/build error (never on a diff). See eng/emitter-diff.

on:
pull_request:
branches: [main, release/*]
paths:
- "packages/http-client-python/**"
- "eng/emitter-diff/**"
- ".github/workflows/ci-emitter-diff-python.yml"
workflow_dispatch:
inputs:
baseline:
description: "Baseline emitter ref (npm version, local path, or github ref). Defaults to the PR merge-base."
required: false
default: ""

permissions:
contents: read
pull-requests: write

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
emitter-diff:
name: "Generate & Diff"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: ./.github/actions/setup
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Install repo dependencies (emitter-diff tool)
run: pnpm install

- name: Determine baseline
id: baseline
# Dispatch input is untrusted: pass via env, reference only as "$VAR".
env:
BASELINE_INPUT: ${{ github.event.inputs.baseline || '' }}
BASE_REF: ${{ github.event.pull_request.base.ref || github.event.repository.default_branch }}
RUNNER_TEMP: ${{ runner.temp }}
run: |
input="$BASELINE_INPUT"
if [ -z "$input" ]; then
# merge-base = a real commit on the base branch; survives squash/rebase.
git fetch --no-tags origin "$BASE_REF"
base_sha="$(git merge-base FETCH_HEAD HEAD)"
[ -n "$base_sha" ] || { echo "::error::No merge-base with $BASE_REF."; exit 1; }
input="gh:$base_sha"
echo "sha=$base_sha" >> "$GITHUB_OUTPUT"
fi
echo "ref=$input" >> "$GITHUB_OUTPUT"
echo "Baseline: $input"

- name: Run emitter diff
id: diff
working-directory: packages/http-client-python
env:
BASELINE_REF: ${{ steps.baseline.outputs.ref }}
RUNNER_TEMP: ${{ runner.temp }}
run: |
set +e
# Script wraps emitter-diff entrypoint. No --fail-on-diff (informational);
# tool/build errors still exit non-zero (checked in "Fail on tool error").
npm run diff-spector-tests -- \
--ci \
--baseline "$BASELINE_REF" \
--html "$RUNNER_TEMP/emitter-diff.html" \
| tee "$RUNNER_TEMP/emitter-diff.log"
echo "status=${PIPESTATUS[0]}" >> "$GITHUB_OUTPUT"
# Reuse the tool's own summary line (strip ANSI) instead of re-parsing.
summary="$(sed -r 's/\x1b\[[0-9;]*m//g' "$RUNNER_TEMP/emitter-diff.log" \
| grep -oE 'Diff summary: [0-9]+ file\(s\), \+[0-9]+ / -[0-9]+' | head -1)"
echo "summary=${summary:-No changes to generated output.}" >> "$GITHUB_OUTPUT"

- name: Upload HTML diff
if: always()
uses: actions/upload-artifact@v7
with:
name: emitter-diff-html
path: ${{ runner.temp }}/emitter-diff.html
if-no-files-found: ignore
retention-days: 7

- name: Comment on PR
if: always() && github.event_name == 'pull_request'
continue-on-error: true
uses: actions/github-script@v7
env:
BASELINE: ${{ steps.baseline.outputs.sha || steps.baseline.outputs.ref }}
STATUS: ${{ steps.diff.outputs.status }}
SUMMARY: ${{ steps.diff.outputs.summary }}
with:
script: |
const marker = "<!-- emitter-diff-python -->";
const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
const { STATUS, SUMMARY, BASELINE } = process.env;
let body = `${marker}\n### Python emitter diff\nBaseline \`${BASELINE}\` vs this PR.\n\n`;
body += STATUS !== "0"
? `⚠️ **emitter-diff failed** (exit \`${STATUS}\`) — tool/build error, not a diff. See the [run](${runUrl}).\n`
: `${SUMMARY}\n\nFull rendered diff: **emitter-diff-html** artifact on the [run](${runUrl}).\n`;
body += `\n_Informational check (eng/emitter-diff); does not block the PR._`;
const { owner, repo } = context.repo, issue_number = context.issue.number;
const { data } = await github.rest.issues.listComments({ owner, repo, issue_number });
const hit = data.find((c) => c.body && c.body.includes(marker));
if (hit) await github.rest.issues.updateComment({ owner, repo, comment_id: hit.id, body });
else await github.rest.issues.createComment({ owner, repo, issue_number, body });

- name: Fail on tool error
if: always()
env:
STATUS: ${{ steps.diff.outputs.status }}
run: '[ "$STATUS" = "0" ] || { echo "::error::emitter-diff failed (exit $STATUS)."; exit 1; }'
2 changes: 2 additions & 0 deletions cspell.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@ words:
- getpgid
- ghapp
- giacamo
- gpgsign
- graalvm
- Gson
- Hdvcmxk
Expand Down Expand Up @@ -155,6 +156,7 @@ words:
- MACVMIMAGEM
- marshal
- mday
- Menlo
- methodsubscriptionid
- mgmt
- mgmtplane
Expand Down
125 changes: 125 additions & 0 deletions eng/emitter-diff/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
# emitter-diff

A language-agnostic tool for **diffing the generated code produced by two versions of a
TypeSpec emitter**.

It generates code from the test specs twice (a **baseline** emitter and a **head** emitter),
then shows the diff between the two outputs. Use it locally during development and in CI on PRs.

Each language emitter plugs in via a small **adapter** that
wraps that emitter's own generation command. The core (ref resolution, diffing, orchestration)
contains no language-specific logic.

## How it works

```text
baseline emitter ─┐
├─ generate ─► <work>/baseline ─┐
specs ─────────────────────┤ ├─► git diff ─► terminal / HTML
├─ generate ─► <work>/head ─────┘
head emitter ─────┘
```

- The **adapter** wraps the emitter's existing commands. For python that is
`packages/http-client-python/eng/scripts/ci/regenerate.ts` (generation).
- The regenerate _driver_ always comes from the current checkout; only the emitter build it points
at (`--pluginDir`) changes between baseline and head, isolating the diff to emitter behavior.

## Usage

```bash
# Run directly with Node 24+ (native TypeScript, no build step, no dependencies):
node eng/emitter-diff/src/cli.ts --emitter python
```

> This tool is a set of plain `.ts` scripts — not an installed package. Node 24 runs TypeScript
> natively, so there is nothing to build or install. Typecheck with `npx tsc -p eng/emitter-diff`.

### Refs

#### Emitter refs (`--baseline`, `--head`)

| Syntax | Meaning |
| --------------------------------- | -------------------------------------- |
| `npm:1.2.3` or `1.2.3` | a published package version (prebuilt) |
| `local:/path` or `./path` | a local source folder |
| `github:owner/repo@<sha\|branch>` | a GitHub source at a ref |
| `gh:<sha\|branch>` | `microsoft/typespec` at a ref |

`--head` defaults to the **current checkout**. `--baseline` defaults to `gh:upstream/main` when present, otherwise `gh:origin/main` (`origin/HEAD` as a fallback).

Use `--name` to filter which packages/specs are generated.

### Common options

By default the tool writes a **clickable HTML report** (`emitter-diff.html`) into the work dir and
prints a `file://` link to it. Use `--html` to write to a specific file.

- `--name <pattern>`: Filter which specs/packages are generated.
- `--html <file>`: Write the rendered HTML report to this path (default: `<work-dir>/emitter-diff.html`).
- `--generated-code-path <path>`: Override the adapter generated-code subpath under each side output root.
For Python, generated files land under `<side>/<path>/tests/generated/...`.
- `--fail-on-diff`: Exit non-zero when output differs (CI gating).

### Examples

```bash
# Explicit baseline version:
node eng/emitter-diff/src/cli.ts --emitter python --baseline 0.34.0 --name authentication-api-key

# Compare two source folders and write an HTML report to a specific path:
node eng/emitter-diff/src/cli.ts --emitter python --baseline local:/path/to/old/http-client-python \
--head local:/path/to/new/http-client-python \
--html diff.html

# Place generated output under a custom subpath for each side (`<side>/generator/tests/generated/...`):
node eng/emitter-diff/src/cli.ts --emitter python --generated-code-path generator

# Diff against a GitHub sha:
node eng/emitter-diff/src/cli.ts --emitter python \
--baseline github:microsoft/typespec@<sha>
```

## CI integration

`.github/workflows/ci-emitter-diff-<lang>.yml` runs on PRs that touch the language emitter or this tool. The **baseline** is the base-branch commit the PR is based on (the `git merge-base` with the target branch). CI builds the emitter for both the PR's checkout and a worktree
of that merge-base commit, diffs the two, and then:

- posts a **sticky PR comment** (updated in place on each push) with the link to download the diff artifact.

**Informational:** the check **always passes unless the tool hits a real tool/build error** — a
generated-output diff does not fail the PR. CI runs the tool without `--fail-on-diff`, so a diff
still exits `0`; only a non-zero exit (a build/venv/generate failure) fails the job. Reviewers use
the PR comment and the HTML artifact to eyeball the diff.

The comment step needs `pull-requests: write`. PRs **from forks** get a read-only token, so the
comment is best-effort there (`continue-on-error`) — the artifact and job-summary still work.

## Adding a new language adapter

1. Implement `EmitterAdapter` (`src/types.ts`) — `prepareEmitter` and `generate` — wrapping that
emitter's own commands (e.g. rust's `npm run tspcompile`, ts's `gen-spector`).
2. Register it in `src/registry.ts`.

That's the only wiring needed — the orchestrator, ref resolver, and diff engine are
language-agnostic and require no changes.

## Notes & limitations

- For the current python regenerate flow, spec inputs come from each side's
dependencies (`node_modules/@typespec/http-specs` and
`node_modules/@azure-tools/azure-http-specs`). Use `--name` to filter.
- `--html` renders a self-contained, GitHub-style HTML report (inline CSS, no external requests).
The tool has **zero runtime dependencies** — the diff itself is `git diff --no-index`.
- For github refs (including fork repos), the resolver uses a detached, commit-keyed cached git
worktree under the temp directory, created from an isolated cache repo (not from your active
checkout). That keeps each version's build/venv isolated and allows repeated runs on the same
commit to reuse existing setup work.
- **Python's native two-phase pipeline needs a venv per emitter version.** `regenerate.ts`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could also use worktree for local diff and there are 2 pros:

  1. we don't need care about venv since worktree is in independent worktree
  2. we could cache commit SHA for worktree then there is no need to regenerate for baseline after running once.

compiles TypeSpec to YAML in-process and then runs a batched Python subprocess
(`eng/scripts/setup/run_batch.py`) that writes the `.py` files using a venv co-located with the
emitter at `<emitter>/venv`. The adapter therefore builds **and** creates a venv (`npm run
install`) for each side before generating. Because each version's venv is built from that
version's generator, a baseline must be a buildable **source** ref (`local`/`github`) or an npm
version whose package can run `npm run install`; the CI workflow uses a worktree of the PR base
commit as the baseline.
Loading
Loading