-
Notifications
You must be signed in to change notification settings - Fork 374
Add language-agnostic emitter-diff tool + python adapter #11122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
l0lawrence
wants to merge
27
commits into
main
Choose a base branch
from
l0lawrence-emitter-diff
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
Show all changes
27 commits
Select commit
Hold shift + click to select a range
b741d3b
Add language-agnostic emitter-diff tool + python adapter
l0lawrence dadc841
Add approved-baseline gate to emitter-diff CI
l0lawrence 244f134
Add gpgsign to cspell word list
l0lawrence b996a66
Make diff2html import resilient to aggregate typecheck
l0lawrence c7d896f
Fix batch Python phase to honor --generatedFolder override
l0lawrence 1b64aa7
Generate baseline and head in parallel by default
l0lawrence ce0ab95
TEMP: point emitter-diff baseline at older commit for demo
l0lawrence 4fca715
Report emitter-diff hard errors distinctly in the PR comment
l0lawrence 943d24a
DEMO: sample emitter output change to exercise the diff gate
l0lawrence 2ab0428
Preserve the emitter-diff exit code in CI (avoid pnpm code masking)
l0lawrence 4d2044d
Harden emitter-diff secondary paths (refs, specs, HTML, validation, d…
l0lawrence 174b1cb
Fix emitter-diff path/dep handling for external baselines
l0lawrence a233572
Drop per-file list from emitter-diff PR comment; link the HTML instead
l0lawrence 17c65ee
Remove --vscode diff option from emitter-diff (HTML report is suffici…
l0lawrence 7e27c74
Harden emitter-diff: fix CI injection, validate git refs, drop emojis
l0lawrence dcd4ca9
Replace committed baseline SHA with merge-base + PR-label approval
l0lawrence 02a7cea
Make emitter-diff check informational (pass unless real error)
l0lawrence a1c0118
Simplify emitter-diff: zero-dep HTML report, drop run-tests
l0lawrence 722e756
Run emitter-diff as plain node scripts, not a package
l0lawrence c2e2e99
Allow .ts extension imports in eng typecheck
l0lawrence abe62b0
Log baseline/head output dirs after generation
l0lawrence 4601b99
Slim down emitter-diff CI workflow
l0lawrence 02e6400
start addressing comments (now a default script for python but if wan…
l0lawrence 88fff4f
Revert spec-dir resolution in regenerate.ts; keep only generatedFolde…
l0lawrence 05e1bea
keep path diff for now
l0lawrence a56d50c
Merge remote-tracking branch 'origin/main' into l0lawrence-emitter-diff
l0lawrence 9925d65
Fix emitter-diff CI: lint cause, spelling, changeset
l0lawrence File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| changeKind: internal | ||
| packages: | ||
| - "@typespec/http-client-python" | ||
| --- | ||
|
|
||
| Add a `diff-spector-tests` script wired to the new language-agnostic emitter-diff tool, and honor `--generatedFolder` in the batch Python phase of `regenerate.ts` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| name: "python / emitter diff" | ||
|
|
||
| # Diffs generated code between this PR's emitter and the merge-base baseline. | ||
| # Informational only: reports a sticky PR comment + HTML artifact, and fails the | ||
| # job only on a tool/build error (never on a diff). See eng/emitter-diff. | ||
|
|
||
| on: | ||
| pull_request: | ||
| branches: [main, release/*] | ||
| paths: | ||
| - "packages/http-client-python/**" | ||
| - "eng/emitter-diff/**" | ||
| - ".github/workflows/ci-emitter-diff-python.yml" | ||
| workflow_dispatch: | ||
| inputs: | ||
| baseline: | ||
| description: "Baseline emitter ref (npm version, local path, or github ref). Defaults to the PR merge-base." | ||
| required: false | ||
| default: "" | ||
|
|
||
| permissions: | ||
| contents: read | ||
| pull-requests: write | ||
|
|
||
| concurrency: | ||
| group: ${{ github.workflow }}-${{ github.ref }} | ||
| cancel-in-progress: true | ||
|
|
||
| jobs: | ||
| emitter-diff: | ||
| name: "Generate & Diff" | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v6 | ||
| with: | ||
| fetch-depth: 0 | ||
| - uses: ./.github/actions/setup | ||
| - uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: "3.12" | ||
| - name: Install repo dependencies (emitter-diff tool) | ||
| run: pnpm install | ||
|
|
||
| - name: Determine baseline | ||
| id: baseline | ||
| # Dispatch input is untrusted: pass via env, reference only as "$VAR". | ||
| env: | ||
| BASELINE_INPUT: ${{ github.event.inputs.baseline || '' }} | ||
| BASE_REF: ${{ github.event.pull_request.base.ref || github.event.repository.default_branch }} | ||
| RUNNER_TEMP: ${{ runner.temp }} | ||
| run: | | ||
| input="$BASELINE_INPUT" | ||
| if [ -z "$input" ]; then | ||
| # merge-base = a real commit on the base branch; survives squash/rebase. | ||
| git fetch --no-tags origin "$BASE_REF" | ||
| base_sha="$(git merge-base FETCH_HEAD HEAD)" | ||
| [ -n "$base_sha" ] || { echo "::error::No merge-base with $BASE_REF."; exit 1; } | ||
| input="gh:$base_sha" | ||
| echo "sha=$base_sha" >> "$GITHUB_OUTPUT" | ||
| fi | ||
| echo "ref=$input" >> "$GITHUB_OUTPUT" | ||
| echo "Baseline: $input" | ||
|
|
||
| - name: Run emitter diff | ||
| id: diff | ||
| working-directory: packages/http-client-python | ||
| env: | ||
| BASELINE_REF: ${{ steps.baseline.outputs.ref }} | ||
| RUNNER_TEMP: ${{ runner.temp }} | ||
| run: | | ||
| set +e | ||
| # Script wraps emitter-diff entrypoint. No --fail-on-diff (informational); | ||
| # tool/build errors still exit non-zero (checked in "Fail on tool error"). | ||
| npm run diff-spector-tests -- \ | ||
| --ci \ | ||
| --baseline "$BASELINE_REF" \ | ||
| --html "$RUNNER_TEMP/emitter-diff.html" \ | ||
| | tee "$RUNNER_TEMP/emitter-diff.log" | ||
| echo "status=${PIPESTATUS[0]}" >> "$GITHUB_OUTPUT" | ||
| # Reuse the tool's own summary line (strip ANSI) instead of re-parsing. | ||
| summary="$(sed -r 's/\x1b\[[0-9;]*m//g' "$RUNNER_TEMP/emitter-diff.log" \ | ||
| | grep -oE 'Diff summary: [0-9]+ file\(s\), \+[0-9]+ / -[0-9]+' | head -1)" | ||
| echo "summary=${summary:-No changes to generated output.}" >> "$GITHUB_OUTPUT" | ||
|
|
||
| - name: Upload HTML diff | ||
| if: always() | ||
| uses: actions/upload-artifact@v7 | ||
| with: | ||
| name: emitter-diff-html | ||
| path: ${{ runner.temp }}/emitter-diff.html | ||
| if-no-files-found: ignore | ||
| retention-days: 7 | ||
|
|
||
| - name: Comment on PR | ||
| if: always() && github.event_name == 'pull_request' | ||
| continue-on-error: true | ||
| uses: actions/github-script@v7 | ||
| env: | ||
| BASELINE: ${{ steps.baseline.outputs.sha || steps.baseline.outputs.ref }} | ||
| STATUS: ${{ steps.diff.outputs.status }} | ||
| SUMMARY: ${{ steps.diff.outputs.summary }} | ||
| with: | ||
| script: | | ||
| const marker = "<!-- emitter-diff-python -->"; | ||
| const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`; | ||
| const { STATUS, SUMMARY, BASELINE } = process.env; | ||
| let body = `${marker}\n### Python emitter diff\nBaseline \`${BASELINE}\` vs this PR.\n\n`; | ||
| body += STATUS !== "0" | ||
| ? `⚠️ **emitter-diff failed** (exit \`${STATUS}\`) — tool/build error, not a diff. See the [run](${runUrl}).\n` | ||
| : `${SUMMARY}\n\nFull rendered diff: **emitter-diff-html** artifact on the [run](${runUrl}).\n`; | ||
| body += `\n_Informational check (eng/emitter-diff); does not block the PR._`; | ||
| const { owner, repo } = context.repo, issue_number = context.issue.number; | ||
| const { data } = await github.rest.issues.listComments({ owner, repo, issue_number }); | ||
| const hit = data.find((c) => c.body && c.body.includes(marker)); | ||
| if (hit) await github.rest.issues.updateComment({ owner, repo, comment_id: hit.id, body }); | ||
| else await github.rest.issues.createComment({ owner, repo, issue_number, body }); | ||
|
|
||
| - name: Fail on tool error | ||
| if: always() | ||
| env: | ||
| STATUS: ${{ steps.diff.outputs.status }} | ||
| run: '[ "$STATUS" = "0" ] || { echo "::error::emitter-diff failed (exit $STATUS)."; exit 1; }' |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,125 @@ | ||
| # emitter-diff | ||
|
|
||
| A language-agnostic tool for **diffing the generated code produced by two versions of a | ||
| TypeSpec emitter**. | ||
|
|
||
| It generates code from the test specs twice (a **baseline** emitter and a **head** emitter), | ||
| then shows the diff between the two outputs. Use it locally during development and in CI on PRs. | ||
|
|
||
| Each language emitter plugs in via a small **adapter** that | ||
| wraps that emitter's own generation command. The core (ref resolution, diffing, orchestration) | ||
| contains no language-specific logic. | ||
|
|
||
| ## How it works | ||
|
|
||
| ```text | ||
| baseline emitter ─┐ | ||
| ├─ generate ─► <work>/baseline ─┐ | ||
| specs ─────────────────────┤ ├─► git diff ─► terminal / HTML | ||
| ├─ generate ─► <work>/head ─────┘ | ||
| head emitter ─────┘ | ||
| ``` | ||
|
|
||
| - The **adapter** wraps the emitter's existing commands. For python that is | ||
| `packages/http-client-python/eng/scripts/ci/regenerate.ts` (generation). | ||
| - The regenerate _driver_ always comes from the current checkout; only the emitter build it points | ||
| at (`--pluginDir`) changes between baseline and head, isolating the diff to emitter behavior. | ||
|
|
||
| ## Usage | ||
|
|
||
| ```bash | ||
| # Run directly with Node 24+ (native TypeScript, no build step, no dependencies): | ||
| node eng/emitter-diff/src/cli.ts --emitter python | ||
| ``` | ||
|
|
||
| > This tool is a set of plain `.ts` scripts — not an installed package. Node 24 runs TypeScript | ||
| > natively, so there is nothing to build or install. Typecheck with `npx tsc -p eng/emitter-diff`. | ||
|
|
||
| ### Refs | ||
|
|
||
| #### Emitter refs (`--baseline`, `--head`) | ||
|
|
||
| | Syntax | Meaning | | ||
| | --------------------------------- | -------------------------------------- | | ||
| | `npm:1.2.3` or `1.2.3` | a published package version (prebuilt) | | ||
| | `local:/path` or `./path` | a local source folder | | ||
| | `github:owner/repo@<sha\|branch>` | a GitHub source at a ref | | ||
| | `gh:<sha\|branch>` | `microsoft/typespec` at a ref | | ||
|
|
||
| `--head` defaults to the **current checkout**. `--baseline` defaults to `gh:upstream/main` when present, otherwise `gh:origin/main` (`origin/HEAD` as a fallback). | ||
|
|
||
| Use `--name` to filter which packages/specs are generated. | ||
|
|
||
| ### Common options | ||
|
|
||
| By default the tool writes a **clickable HTML report** (`emitter-diff.html`) into the work dir and | ||
| prints a `file://` link to it. Use `--html` to write to a specific file. | ||
|
|
||
| - `--name <pattern>`: Filter which specs/packages are generated. | ||
| - `--html <file>`: Write the rendered HTML report to this path (default: `<work-dir>/emitter-diff.html`). | ||
| - `--generated-code-path <path>`: Override the adapter generated-code subpath under each side output root. | ||
| For Python, generated files land under `<side>/<path>/tests/generated/...`. | ||
| - `--fail-on-diff`: Exit non-zero when output differs (CI gating). | ||
|
|
||
| ### Examples | ||
|
|
||
| ```bash | ||
| # Explicit baseline version: | ||
| node eng/emitter-diff/src/cli.ts --emitter python --baseline 0.34.0 --name authentication-api-key | ||
|
|
||
| # Compare two source folders and write an HTML report to a specific path: | ||
| node eng/emitter-diff/src/cli.ts --emitter python --baseline local:/path/to/old/http-client-python \ | ||
| --head local:/path/to/new/http-client-python \ | ||
| --html diff.html | ||
|
|
||
| # Place generated output under a custom subpath for each side (`<side>/generator/tests/generated/...`): | ||
| node eng/emitter-diff/src/cli.ts --emitter python --generated-code-path generator | ||
|
|
||
| # Diff against a GitHub sha: | ||
| node eng/emitter-diff/src/cli.ts --emitter python \ | ||
| --baseline github:microsoft/typespec@<sha> | ||
| ``` | ||
|
|
||
| ## CI integration | ||
|
|
||
| `.github/workflows/ci-emitter-diff-<lang>.yml` runs on PRs that touch the language emitter or this tool. The **baseline** is the base-branch commit the PR is based on (the `git merge-base` with the target branch). CI builds the emitter for both the PR's checkout and a worktree | ||
| of that merge-base commit, diffs the two, and then: | ||
|
|
||
| - posts a **sticky PR comment** (updated in place on each push) with the link to download the diff artifact. | ||
|
|
||
| **Informational:** the check **always passes unless the tool hits a real tool/build error** — a | ||
| generated-output diff does not fail the PR. CI runs the tool without `--fail-on-diff`, so a diff | ||
| still exits `0`; only a non-zero exit (a build/venv/generate failure) fails the job. Reviewers use | ||
| the PR comment and the HTML artifact to eyeball the diff. | ||
|
|
||
| The comment step needs `pull-requests: write`. PRs **from forks** get a read-only token, so the | ||
| comment is best-effort there (`continue-on-error`) — the artifact and job-summary still work. | ||
|
|
||
| ## Adding a new language adapter | ||
|
|
||
| 1. Implement `EmitterAdapter` (`src/types.ts`) — `prepareEmitter` and `generate` — wrapping that | ||
| emitter's own commands (e.g. rust's `npm run tspcompile`, ts's `gen-spector`). | ||
| 2. Register it in `src/registry.ts`. | ||
|
|
||
| That's the only wiring needed — the orchestrator, ref resolver, and diff engine are | ||
| language-agnostic and require no changes. | ||
|
|
||
| ## Notes & limitations | ||
|
|
||
| - For the current python regenerate flow, spec inputs come from each side's | ||
| dependencies (`node_modules/@typespec/http-specs` and | ||
| `node_modules/@azure-tools/azure-http-specs`). Use `--name` to filter. | ||
| - `--html` renders a self-contained, GitHub-style HTML report (inline CSS, no external requests). | ||
| The tool has **zero runtime dependencies** — the diff itself is `git diff --no-index`. | ||
| - For github refs (including fork repos), the resolver uses a detached, commit-keyed cached git | ||
| worktree under the temp directory, created from an isolated cache repo (not from your active | ||
| checkout). That keeps each version's build/venv isolated and allows repeated runs on the same | ||
| commit to reuse existing setup work. | ||
| - **Python's native two-phase pipeline needs a venv per emitter version.** `regenerate.ts` | ||
| compiles TypeSpec to YAML in-process and then runs a batched Python subprocess | ||
| (`eng/scripts/setup/run_batch.py`) that writes the `.py` files using a venv co-located with the | ||
| emitter at `<emitter>/venv`. The adapter therefore builds **and** creates a venv (`npm run | ||
| install`) for each side before generating. Because each version's venv is built from that | ||
| version's generator, a baseline must be a buildable **source** ref (`local`/`github`) or an npm | ||
| version whose package can run `npm run install`; the CI workflow uses a worktree of the PR base | ||
| commit as the baseline. | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could also use worktree for local diff and there are 2 pros:
venvsince worktree is in independent worktree