[Misc] Add coverage report to PRs, including kernels by hughperkins · Pull Request #470 · Genesis-Embodied-AI/quadrants

hughperkins · 2026-04-10T23:07:41Z

Issue: #

Summary

This PR adds kernel code coverage to quadrants — tracking which lines and branches actually
execute inside compiled kernels on GPU/CPU, not just host-side Python code. The coverage data is
written in standard coverage.py format, so it integrates with pytest-cov, diff-cover, and
IDE coverage viewers out of the box.

What's included

Core feature (_kernel_coverage.py, 275 lines):
• AST rewriting inserts lightweight field-store probes into @qd.kernel and @qd.func at each
  source line. Probes inside if/else bodies give true runtime branch coverage.
• Zero overhead when disabled: the module is never imported unless QD_KERNEL_COVERAGE=1.
• Survives qd.init() resets by harvesting probe data before runtime destruction.
• Autodiff-safe: probes are excluded from backward/AD replay compilations.
• Thread-safe: coverage field is eagerly allocated on the main thread during qd.init().

Pytest plugin (pytest_plugin.py):
• Registered via pytest11 entry point. Automatically sets QD_KERNEL_COVERAGE=1 when
  pytest-cov is active. Users can opt out with QD_KERNEL_COVERAGE=0.

CI integration (linux.yml, 4_test.sh, 4_test_cuda.sh):
• Runs kernel coverage alongside pytest-cov across CPU and CUDA test phases.
• Posts diff coverage as PR comments with 80% gate on changed lines.
• Merges kernel and Python coverage data into unified reports.

Coverage report tool (coverage_report.py, 458 lines):
• Shared by CI and local dev. Supports HTML diff reports, terminal summaries, and PR comments
  with collapsible annotated code sections.

Tests (test_kernel_coverage.py, 488 lines):
• Covers: basic line/branch coverage, if/else/while/with/try, @qd.func,
  multi-kernel, deduplication, reinit survival, autodiff exclusion, env var control,
  probe capacity overflow.

Docs (kernel_coverage.md):
• User guide covering setup, pytest-cov integration, autodiff behavior, offline cache
  interaction, and advanced configuration.

Strengths

Standard format: outputs .coverage files, so the entire Python coverage ecosystem
(pytest-cov, diff-cover, coverage.py, IDE integrations) works without any adapter.
Zero-cost when off: no import, no AST changes, no field allocation unless explicitly enabled.
Automatic with pytest-cov: the pytest plugin means most users get kernel coverage for free
without learning about QD_KERNEL_COVERAGE.
Branch coverage: knows which side of an if/else actually ran on the device, not just
which lines were compiled.
Comprehensive tests: 488 lines of tests covering edge cases (reinit, autodiff, capacity,
xdist, concurrent kernels).

Weaknesses / known limitations

Autodiff validation=True: kernel calls inside qd.ad.Tape(validation=True) are not
covered, because the AD compilation mode skips probe insertion.
Offline cache invalidation: coverage probes change the compiled kernel, so the offline cache
misses on first run with coverage enabled. Expected but worth knowing.
Probe capacity is fixed per-process: defaults to 100k probes. Very large codebases with
many kernels may need to increase QD_COVERAGE_MAX_PROBES.
No inter-line arc data: the probe design knows which lines ran but not the actual
transitions between them, so arc data uses entry/exit arcs per line rather than true
control-flow arcs. This is sufficient for coverage report and diff-cover but won't
give precise branch-pair information.
Large commit count: ~80 commits reflecting iterative CI debugging. Could benefit from a
squash or interactive rebase before merge.

=====================

Here's the PR story:

────────────────────────────────────────

The problem

When you write a quadrants kernel and run your tests with pytest --cov, coverage.py measures which Python lines executed — but it has no visibility into what happens inside
@qd.kernel or @qd.func bodies. Those are compiled to GPU/CPU code and run on the device. From coverage.py's perspective, the kernel call site is one line, and it either ran or
didn't. You can't tell which branches inside the kernel were taken, or whether a particular code path in your kernel is dead.
This PR adds kernel code coverage: tracking which source lines actually execute inside compiled kernels, and feeding that data back into the standard coverage.py ecosystem so
it shows up in coverage report, coverage html, PR comments, etc.

The approach

The key insight is that quadrants already compiles kernels from Python AST → its own IR → device code. We can intercept at the AST stage and insert extra statements — probes —
before each source line. Each probe is just a field store: _qd_cov[probe_id] = 1. The existing compilation pipeline treats this as an ordinary integer write, so it compiles
and runs on the device alongside the user's code with no special C++ support needed.
After the kernel runs, we read the field back to the host. Any probe_id slot that contains 1 tells us that line executed.
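
The read-back step can be illustrated without a device. The sketch below is a simplified stand-in, not the module's code: a plain Python list plays the role of the coverage field, and `covered_lines` is a hypothetical helper name. It only shows how non-zero probe slots map back to source lines through the probe map.

```python
# Hypothetical stand-ins: a plain list plays the role of the device field,
# and probe_map mirrors the {probe_id: (filepath, lineno)} mapping.
def covered_lines(field_values, probe_map):
    """Return {filepath: set of line numbers} for every non-zero probe slot."""
    hits = {}
    for probe_id, (filepath, lineno) in probe_map.items():
        # Guard against probe ids beyond the field size, then test the slot.
        if probe_id < len(field_values) and field_values[probe_id] != 0:
            hits.setdefault(filepath, set()).add(lineno)
    return hits


probe_map = {0: ("sim.py", 10), 1: ("sim.py", 12), 2: ("sim.py", 14)}
field_values = [0] * 8
field_values[0] = 1  # the probe before line 10 fired
field_values[2] = 1  # the probe before line 14 fired
result = covered_lines(field_values, probe_map)
```

Here `result` contains lines 10 and 14 for `sim.py`; line 12 is absent because its slot stayed at zero.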

The implementation

Enabling: environment variable gate

Everything is gated on QD_KERNEL_COVERAGE=1. Both _func_base.py and kernel.py read this at module load time into a _KERNEL_COVERAGE bool. When it's False, the coverage module
is never imported and there's zero overhead.

_KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"

The AST rewriter (_kernel_coverage.py)

The core of the feature is _CoverageASTRewriter, an ast.NodeTransformer. It walks a kernel's AST and, for each statement at a new source line, prepends a probe node:

_qd_cov[<probe_id>] = 1

It has visit_* methods for every compound statement type (if/else, for, while, with, try/except/finally) so that probes land inside each branch body. This means if you have an
if/else, the if-body probe only fires when that branch is taken — giving true runtime branch coverage, not just line coverage.
Each probe gets a unique integer ID, and the rewriter records a mapping: {probe_id: (filepath, lineno)}. A _seen_lines set prevents duplicate probes when two statements share
a source line (e.g. a = 1; b = 2).
There's a configurable capacity limit (QD_COVERAGE_MAX_PROBES, default 100k) — if exceeded, a warning is emitted and additional lines are silently skipped.
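
A toy version of the rewriting idea, runnable in plain CPython, may help make this concrete. It is a sketch under assumptions: `ProbeInserter`, `_cov`, and `run` are illustrative names, only `FunctionDef` and `If` are handled, the capacity limit is omitted, and a Python list stands in for the device field. It needs Python 3.9+ because of how `ast.Subscript` slices are represented.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Toy rewriter: prepend `_cov[<id>] = 1` to each statement at a new line."""

    def __init__(self, start_lineno=1):
        self.start_lineno = start_lineno
        self.probe_map = {}  # probe_id -> absolute line number
        self.seen = set()    # absolute lines that already have a probe

    def _probe(self, probe_id):
        return ast.Assign(
            targets=[ast.Subscript(
                value=ast.Name(id="_cov", ctx=ast.Load()),
                slice=ast.Constant(value=probe_id),
                ctx=ast.Store(),
            )],
            value=ast.Constant(value=1),
        )

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            # Line numbers in the parsed snippet are relative to its first line.
            abs_lineno = stmt.lineno + self.start_lineno - 1
            if abs_lineno not in self.seen:
                self.seen.add(abs_lineno)
                probe_id = len(self.probe_map)
                self.probe_map[probe_id] = abs_lineno
                out.append(self._probe(probe_id))
            out.append(self.visit(stmt))
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        if node.orelse:
            node.orelse = self._instrument(node.orelse)
        return node


SOURCE = textwrap.dedent("""\
    def f(x):
        y = 0
        if x > 0:
            y = 1
        else:
            y = -1
        return y
""")

rewriter = ProbeInserter()
tree = ast.fix_missing_locations(rewriter.visit(ast.parse(SOURCE)))


def run(x):
    """Execute the instrumented f(x) and return the set of lines that ran."""
    cov = [0] * len(rewriter.probe_map)
    namespace = {"_cov": cov}
    exec(compile(tree, "<toy>", "exec"), namespace)
    namespace["f"](x)
    return {rewriter.probe_map[i] for i, hit in enumerate(cov) if hit}
```

Calling `run(5)` reports the `if` body (line 4) but not the `else` body (line 6), and `run(-5)` reports the opposite, which is the branch selectivity described above.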

Hooking into compilation (_func_base.py)

When a kernel is about to be compiled, FuncBase._compile_function runs. This is where the AST gets transformed before being handed to the quadrants compiler. The new code
checks two conditions:

_KERNEL_COVERAGE is enabled
The kernel is being compiled in AutodiffMode.NONE (i.e., the normal forward compilation — not an AD replay)

If both are true, it calls _kernel_coverage.rewrite_ast(tree, filepath, start_lineno) to inject probes into the AST. It then retrieves the coverage field via get_field() and
adds it to the kernel's global_vars dict under the name _qd_cov, so the compiled probe statements can resolve the field reference.
The AD exclusion is important: the backward pass is an auto-generated transformation of the kernel. Inserting field stores into it would break gradient computation (they'd be
treated as differentiable operations). Since the backward pass replays the same control flow as the forward pass, there are no user-written lines that appear only in the
backward pass, so nothing is lost.
This also applies to @qd.func bodies — they go through the same _compile_function path and receive probes when called from a non-AD kernel.

Allocating the probe field (kernel.py)

Before compilation, Kernel.materialize_layout calls _kernel_coverage.ensure_field_allocated(). This creates a single global qd.field(dtype=qd.i32, shape=(_MAX_PROBES,)) — a flat
integer array on the device. All probes across all kernels in the process write into this one field, indexed by their probe ID.
The function uses double-checked locking: it checks whether the field already exists (and belongs to the current Program instance), and only allocates under a lock if needed.
This handles the case where qd.init() is called again mid-process, creating a new Program — the field must be re-allocated against the new runtime.
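
The allocation pattern can be sketched generically. This is an illustration only: `Program`, `ensure_allocated`, and the `allocate` callback are hypothetical stand-ins, and a list replaces the real field. The point is the identity check on the owning instance, done once without the lock and once again under it.

```python
import threading

_lock = threading.Lock()
_field = None
_field_owner = None
allocations = 0  # counts how often the (hypothetical) allocation actually ran


def ensure_allocated(current_owner, allocate):
    """Allocate once per owner; re-allocate when the owner object changes."""
    global _field, _field_owner, allocations
    # Fast path: skip the lock when the field already belongs to this owner.
    if _field is not None and _field_owner is current_owner:
        return _field
    with _lock:
        # Re-check under the lock: another thread may have allocated meanwhile.
        if _field is not None and _field_owner is current_owner:
            return _field
        _field = allocate()
        _field_owner = current_owner
        allocations += 1
        return _field


class Program:  # hypothetical stand-in for a runtime instance
    pass


prog_a, prog_b = Program(), Program()
first = ensure_allocated(prog_a, lambda: [0] * 4)
again = ensure_allocated(prog_a, lambda: [0] * 4)         # same owner: reused
after_reinit = ensure_allocated(prog_b, lambda: [0] * 4)  # new owner: re-allocated
```

Comparing owners with `is` rather than `==` matters here: a fresh runtime must trigger re-allocation even if it compares equal to the old one.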

Surviving runtime resets

Users can call qd.reset() / qd.init() to tear down and recreate the runtime. This destroys all fields, including our coverage field. If we don't harvest the probe data before
that happens, it's lost.
The fix is _install_reset_hook(), which monkey-patches PyQuadrants.clear() (the method that destroys the runtime) to call _harvest_field() first. _harvest_field() reads the field
back to numpy, iterates the probe map, and records which lines fired into _accumulated_lines: dict[str, set[int]] — a durable Python dict that survives across runtime resets.
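
The harvest-before-destroy hook can be sketched with a stand-in class. Everything named here (`Runtime`, `harvest`, `install_reset_hook`, `accumulated`) is illustrative rather than the library's API; the sketch only demonstrates wrapping a `clear()` method so that data is copied into a durable dict before the field disappears.

```python
class Runtime:  # hypothetical stand-in for the class whose clear() destroys fields
    def __init__(self):
        self.field = [0, 0, 0]

    def clear(self):
        self.field = None


accumulated = {}  # survives resets: {filepath: set of line numbers}
probe_map = {0: ("k.py", 3), 1: ("k.py", 5), 2: ("k.py", 7)}
_hook_installed = False


def harvest(runtime):
    """Copy fired probes into `accumulated` while the field still exists."""
    if runtime.field is None:
        return
    for probe_id, (path, lineno) in probe_map.items():
        if runtime.field[probe_id]:
            accumulated.setdefault(path, set()).add(lineno)


def install_reset_hook():
    """Wrap Runtime.clear once so every teardown harvests first."""
    global _hook_installed
    if _hook_installed:
        return
    original_clear = Runtime.clear

    def hooked_clear(self):
        harvest(self)
        original_clear(self)

    Runtime.clear = hooked_clear
    _hook_installed = True


install_reset_hook()
install_reset_hook()  # second call is a no-op, so clear() is wrapped only once

rt = Runtime()
rt.field[1] = 1  # line 5 ran in the first session
rt.clear()

rt2 = Runtime()
rt2.field[2] = 1  # line 7 ran after the reset
rt2.clear()
```

After both resets, `accumulated` holds lines 5 and 7 even though neither field exists any more.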

Writing output (flush)

flush() is registered with atexit. It calls _harvest_field() one last time (for the current session), then writes _accumulated_lines to a _qd_kcov.<pid> file using coverage.py's
CoverageData API.
It detects whether pytest-cov is using branch (arc) mode by reading the .coverage file. If so, it emits entry/exit arcs (-1, line) and (line, -1) for each covered line. These are
minimal arcs that correctly represent "this line ran" without fabricating transitions between lines that may not reflect actual control flow. If pytest-cov is in line mode, it
writes line data directly.
The PID-based filename ensures parallel pytest-xdist workers don't collide.
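
The arc synthesis is simple enough to show in isolation. The helper name `entry_exit_arcs` is hypothetical, and the sketch deliberately does not import coverage.py so that it runs with the standard library alone; in the PR the resulting mapping is what gets passed to `CoverageData.add_arcs`.

```python
def entry_exit_arcs(lines_by_file):
    """Build {filepath: [(-1, line), (line, -1), ...]} from covered lines.

    Each covered line gets one entry arc and one exit arc. No arc connects
    two lines, because the probes do not record transitions between lines.
    """
    arcs_by_file = {}
    for path, lines in lines_by_file.items():
        arcs = []
        for line in sorted(lines):
            arcs.append((-1, line))
            arcs.append((line, -1))
        arcs_by_file[path] = arcs
    return arcs_by_file


arcs = entry_exit_arcs({"kernels.py": {12, 10}})
```

For lines 10 and 12 this yields four arcs and none of the form `(10, 12)`, which is why the data cannot express precise branch pairs.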

Pure kernel exemption (ast_transformer_utils.py)

Quadrants has a "pure kernel" concept (@qd.kernel(fastcache=True)) where the compiler verifies the kernel doesn't access mutable global state. Our _qd_cov field is a global, so
without an exemption, every pure kernel would fail to compile with coverage enabled.
The fix generalizes an existing check: any global variable whose name starts with _qd_ is exempted from the purity violation. This is a one-line change from if name != "_qd_cov"
to if not name.startswith("_qd_"), keeping it open for future internal globals.

CI integration (linux.yml, 4_test.sh, 4_test_cuda.sh)

The CI workflow runs tests in three stages:

CPU tests (4_test.sh): runs run_tests.py --coverage (which sets QD_KERNEL_COVERAGE=1 and enables pytest-cov), then calls coverage_report.py --collect-only to combine kernel
+ Python coverage and emit coverage.xml.
CUDA tests (4_test_cuda.sh): same pattern, on a GPU runner using the wheel built in step 1.
Coverage comment (coverage-comment job): downloads both coverage.xml files, runs coverage_report.py --report-only --format markdown to generate a diff coverage report
against origin/main, and posts it as a PR comment via gh pr comment.

A few existing tests are incompatible with coverage probes (they assert on compiled kernel identity, cache file counts, or thread-safety of add_struct_module). These get
pytest.mark.skipif(QD_KERNEL_COVERAGE) markers and still run in the non-coverage test phases.

Coverage report tool (coverage_report.py)

A standalone script used by both CI and local development. It has two responsibilities:

Collect: rename .coverage to .coverage.pytest, combine with _qd_kcov.* files via coverage combine, generate coverage.xml.
Report: parse coverage.xml for line hits, parse git diff for changed lines, cross-reference to produce a diff coverage report showing which changed lines are covered/missed.
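
The cross-referencing in the Report step can be sketched as follows. This is not the script's code: `line_hits` and `diff_coverage` are hypothetical names, the set of changed lines is assumed to have been extracted from git diff already, and skipping lines that are absent from the XML (treating them as non-executable) is an assumption of this sketch. The XML shape is the Cobertura-style layout that coverage.py emits.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """<coverage><packages><package name="pkg"><classes>
<class name="a.py" filename="pkg/a.py"><lines>
<line number="1" hits="1"/><line number="2" hits="0"/><line number="3" hits="1"/>
</lines></class></classes></package></packages></coverage>"""


def line_hits(xml_text):
    """Parse coverage XML into {filename: {lineno: hit_count}}."""
    hits = {}
    root = ET.fromstring(xml_text)
    for cls in root.iter("class"):
        per_file = hits.setdefault(cls.get("filename"), {})
        for line in cls.iter("line"):
            per_file[int(line.get("number"))] = int(line.get("hits"))
    return hits


def diff_coverage(hits, changed):
    """Return (covered, missed, percent) over the changed lines only."""
    covered = missed = 0
    for path, lines in changed.items():
        for lineno in lines:
            count = hits.get(path, {}).get(lineno)
            if count is None:
                continue  # assumption: not an executable line, so not counted
            if count > 0:
                covered += 1
            else:
                missed += 1
    total = covered + missed
    percent = 100.0 * covered / total if total else 100.0
    return covered, missed, percent


hits = line_hits(SAMPLE_XML)
summary = diff_coverage(hits, {"pkg/a.py": {2, 3, 10}})
```

With changed lines 2, 3, and 10, line 3 counts as covered, line 2 as missed, and line 10 is ignored, giving 50% diff coverage. A gate such as the 80% threshold would compare against the third element of `summary`.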

Four output formats are supported via a renderer class hierarchy (_TerminalRenderer, _AnnotatedRenderer, _MarkdownRenderer, _HtmlRenderer), all driven by a single loop in
generate_report that calls begin / begin_file / write_line / end_file / finish.
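
A minimal sketch of the single-loop design, assuming method signatures that the description does not spell out. Only the method names (begin, begin_file, write_line, end_file, finish) come from the text above; the classes `PlainRenderer` and `CountingRenderer`, the arguments, and the input shape are hypothetical.

```python
class PlainRenderer:
    """Hypothetical renderer: method names follow the text, signatures are assumed."""

    def __init__(self):
        self.parts = []

    def begin(self):
        self.parts.append("Diff coverage report")

    def begin_file(self, path):
        self.parts.append(f"== {path}")

    def write_line(self, lineno, text, covered):
        mark = "+" if covered else "-"
        self.parts.append(f"{mark} {lineno:4d}  {text}")

    def end_file(self):
        self.parts.append("")

    def finish(self):
        return "\n".join(self.parts)


class CountingRenderer(PlainRenderer):
    """Second format driven by the same loop: only per-file totals."""

    def begin_file(self, path):
        self.path, self.hit, self.total = path, 0, 0

    def write_line(self, lineno, text, covered):
        self.total += 1
        self.hit += bool(covered)

    def end_file(self):
        self.parts.append(f"{self.path}: {self.hit}/{self.total}")


def generate_report(renderer, files):
    """One driver loop; the renderer decides what each event looks like."""
    renderer.begin()
    for path, lines in files.items():
        renderer.begin_file(path)
        for lineno, text, covered in lines:
            renderer.write_line(lineno, text, covered)
        renderer.end_file()
    return renderer.finish()


files = {"a.py": [(1, "x = 1", True), (2, "y = 2", False)]}
plain = generate_report(PlainRenderer(), files)
counts = generate_report(CountingRenderer(), files)
```

Keeping the traversal in one function means a new output format only has to implement the five callbacks, which is presumably why the report tool can share one code path between CI and local use.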

Documentation (kernel_coverage.md)

A user guide aimed at people using the quadrants library (e.g., Genesis developers), not quadrants framework developers. It covers: enabling coverage, viewing results with
coverage.py and pytest-cov, an example with branch coverage, key properties, probe capacity configuration, autodiff interaction (what is and isn't covered), and a brief "under
the hood" section.

Tests (test_kernel_coverage.py)

The test suite has two layers:
AST rewriter unit tests — parametrized over 6 control-flow constructs (straight-line, if/else, for, while, with, try/except/finally), plus dedicated tests for capacity limiting
and same-line deduplication. These are pure Python and don't need a runtime.
End-to-end tests — run actual kernels on CPU and CUDA and verify probes fired:
• Branch selectivity (taken branch fires, untaken doesn't)
• SIMT with block.sync() and subgroup shuffle (GPU only)
• qd.reset() / qd.init() survival
• Autodiff: forward pass produces probes, backward doesn't add any
• @qd.func called from a kernel
• Multiple kernels accumulating into the same field
• Pure (fastcache) kernel compiles without errors
• _harvest_field graceful failure when to_numpy() throws

Enable pytest-cov in Linux CI and post coverage summary as a PR comment via MishaKav/pytest-coverage-comment. Uses only the built-in GITHUB_TOKEN — no external service or org permissions.

Use diff-cover to report percentage of changed/added lines that are covered by tests. Posts a sticky PR comment with both diff coverage and overall project coverage. No external services needed — uses only the built-in GITHUB_TOKEN.

Fail the Linux CI if less than 80% of changed/added Python lines are covered by tests. The coverage PR comment is posted before the check so numbers are always visible.

Use --format markdown:file.md syntax instead of --format markdown with stdout redirect.

@ti

Code inside @qd.func / @qd.kernel (and @ti. variants) is JIT-compiled to GPU code, so Python coverage.py can never trace it. Exclude these blocks to avoid false-negative coverage on kernel-heavy files.

When QD_KERNEL_COVERAGE=1, rewrite kernel/func Python ASTs to insert coverage probes (field stores) before each statement. The probes execute on the GPU and record which source lines were actually reached, including runtime if/else branches — not just static ones. At process exit, probe data is written to .coverage.kernel which can be merged with pytest-cov data via `coverage combine`. Zero C++ changes. Zero impact on the normal runtime path — the coverage module is only imported when the env var is set.

Track which Program instance the coverage field belongs to. Re-allocate after qd.init() destroys the old SNode tree, preventing dangling field references with garbage dimensions.

start_lineno=10 + relative line 2 - 1 = 11, not 12.

…it() The old flush() tried to read the field at atexit, but by then the runtime was already destroyed (test framework calls qd.reset()), causing to_numpy() to fail silently. Now we harvest probe data into _accumulated_lines whenever ensure_field_allocated() detects a program change, preserving results across reinitializations.

…ction Instead of trying to read the coverage field after the runtime is destroyed (which hangs on CUDA), install a hook on clear() that harvests probe data while the field is still alive. This fixes the hang when switching architectures (e.g. x64 -> cuda) in tests.

run_tests.py --coverage passes --cov-branch to pytest, producing branch/arc coverage data. Our .coverage.kernel was writing line-only data, causing "Can't combine branch coverage data with statement data". Now we detect branch mode and synthesize arcs from covered lines.

_is_branch_coverage() checked config files but --cov-branch is a CLI flag not in config. Now reads the actual .coverage file written by pytest-cov to detect arc mode. Also removes stale .coverage.kernel from previous runs to avoid "no such table: meta" errors.

Uses portable subgroup.shuffle instead of CUDA-specific warp.shfl, and qd.gpu arch so it runs on both CUDA and Vulkan.

The kernel reads flag[0] (a runtime field value) to choose between two shuffle paths, verifying that coverage correctly tracks which branch executed and which didn't.

qd.gpu doesn't match QD_WANTED_ARCHS=cuda filtering in test_utils.

The _qd_cov field is injected into global_vars for coverage instrumentation. Pure kernels flag all global_vars accesses as violations, causing compilation errors. Exempt _qd_cov so coverage works on all kernels including pure ones.

Set QD_KERNEL_COVERAGE=1 in the test script so kernel probes are actually injected during CI runs, and add a coverage combine step to merge .coverage.kernel into the main .coverage before generating reports. Also fix flush() to accumulate kernel data across multiple test phases instead of overwriting.

hughperkins · 2026-04-10T23:08:19Z

 # things, without doing full c++ build
 build-backend = "setuptools.build_meta"

+[tool.coverage.report]


what does this do?

Opus:

That tells coverage.py to treat lines matching those regex patterns as excluded from coverage reporting — as if they had # pragma: no cover.
Specifically, it excludes the decorator lines @qd.func, @qd.kernel, @ti.func, and @ti.kernel. Those lines are never "executed" in the normal Python sense because the decorated
function body gets JIT-compiled rather than interpreted, so without this exclusion they'd show up as uncovered and drag down the coverage percentage.

This seems superfluous now that we removed all the exceptions...

Guard the import behind the env var check at call sites so the module is never loaded in normal operation. This lets _kernel_coverage use top-level imports instead of scattered lazy imports.

hughperkins · 2026-04-11T00:00:42Z

 /python/test_env
 /CHANGELOG.md
 /.coverage
+/.coverage.*


hughperkins · 2026-04-11T00:01:15Z

 # Phase 2: install torch, run only torch tests
 pip install torch --index-url https://download.pytorch.org/whl/cpu
-python tests/run_tests.py -v -r 3 -m needs_torch
+python tests/run_tests.py -v -r 3 -m needs_torch --coverage --cov-append


what does --cov-append do?

opus:

--cov-append tells pytest-cov not to erase the existing .coverage data file before the test run. Instead, the new coverage data is appended/merged into whatever's already
there.
By default, pytest-cov wipes .coverage at the start of each run. With --cov-append, you can run multiple test sessions and accumulate coverage across all of them into a single
combined report.
Typical use case: running tests in separate phases (e.g., field backend then ndarray backend) and merging the results into one coverage report at the end.

hughperkins · 2026-04-23T20:43:49Z

@claude review

claude

⚠️ Code review skipped — your organization's overage spend limit has been reached.

Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit at claude.ai/admin-settings/claude-code.

Once credits are available, comment @claude review on this pull request to trigger a review.

hughperkins · 2026-04-23T21:37:11Z

@claude review

github-actions · 2026-04-23T21:47:26Z

Coverage Report (`49f8dea62`)

Metric	Value
Diff coverage (changed lines only)	84%
Overall project coverage	72%

Total: 444 lines, 71 missing, 84% covered

🟢 python/quadrants/lang/_func_base.py (82%)

🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field

🟢 python/quadrants/lang/_kernel_coverage.py (80%)

      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov wrote branch (arc) data by reading .coverage.
    129  
    130      Defaults to True (arc mode) when .coverage doesn't exist or is empty, since run_tests.py --coverage always
    131      enables --cov-branch.
    132      """
🔴  133      try:
🔴  134          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  135          cd = CoverageData(basename=cov_path)
🔴  136          cd.read()
🔴  137          if not cd.measured_files():
🔴  138              return True
🔴  139          return cd.has_arcs()
🔴  140      except Exception:
🔴  141          logging.debug("Failed to detect arc mode from .coverage file, defaulting to arc mode", exc_info=True)
🔴  142          return True
    143  
    144  
🟢  145  def flush() -> None:
    146      """Harvest any remaining field data and write all results to a .coverage file.
    147  
    148      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    149      is lost across multiple invocations.
    150      """
🔴  151      _harvest_field()
    152  
🔴  153      with _lock:
🔴  154          if not _accumulated_lines:
🔴  155              return
🔴  156          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    157  
🔴  158      base_dir = _coverage_dir or os.getcwd()
🔴  159      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  160      use_arcs = _detect_arc_mode()
    161  
🔴  162      cov = CoverageData(basename=kernel_path)
🔴  163      if use_arcs:
🔴  164          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  165          for filepath, lines in snapshot.items():
    166              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    167              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  168              arcs = []
🔴  169              for line in sorted(lines):
🔴  170                  arcs.append((-1, line))
🔴  171                  arcs.append((line, -1))
🔴  172              arcs_by_file[filepath] = arcs
🔴  173          cov.add_arcs(arcs_by_file)
    174      else:
🔴  175          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  176      cov.write()
    177  
    178  
🟢  179  _capacity_warning_emitted = False
    180  
    181  
🟢  182  class _CoverageASTRewriter(ast.NodeTransformer):
    183      """Insert coverage probes before each statement at a new source line."""
    184  
🟢  185      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  186          self._field_name = field_name
🟢  187          self._filepath = filepath
🟢  188          self._start_lineno = start_lineno
🟢  189          self.next_probe_id = probe_id_start
🟢  190          self._seen_lines: set[int] = set()
🟢  191          self.probe_map: dict[int, tuple[str, int]] = {}
    192  
🟢  193      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    194          global _capacity_warning_emitted
🟢  195          probe_id = self.next_probe_id
🟢  196          if probe_id >= _MAX_PROBES:
🟢  197              if not _capacity_warning_emitted:
🟢  198                  warnings.warn(
    199                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    200                      f"Additional kernel lines will not be tracked. "
    201                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    202                      stacklevel=2,
    203                  )
🟢  204                  _capacity_warning_emitted = True
🟢  205              return None
🟢  206          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  207          self.next_probe_id += 1
🟢  208          node = ast.Assign(
    209              targets=[
    210                  ast.Subscript(
    211                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    212                      slice=ast.Constant(value=probe_id),
    213                      ctx=ast.Store(),
    214                  )
    215              ],
    216              value=ast.Constant(value=1),
    217              lineno=rel_lineno,
    218              col_offset=col_offset,
    219              end_lineno=rel_lineno,
    220              end_col_offset=col_offset,
    221          )
🟢  222          return node
    223  
🟢  224      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  225          result: list[ast.stmt] = []
🟢  226          for stmt in stmts:
🟢  227              rel_lineno = getattr(stmt, "lineno", None)
🟢  228              if rel_lineno is not None:
🟢  229                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  230                  if abs_lineno not in self._seen_lines:
🟢  231                      self._seen_lines.add(abs_lineno)
🟢  232                      col = getattr(stmt, "col_offset", 0)
🟢  233                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  234                      if probe is not None:
🟢  235                          result.append(probe)
🟢  236              result.append(self.visit(stmt))
🟢  237          return result
    238  
🟢  239      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  240          node.body = self._instrument_body(node.body)
🟢  241          return node
    242  
🟢  243      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  244          node.body = self._instrument_body(node.body)
🔴  245          return node
    246  
🟢  247      def visit_If(self, node: ast.If) -> ast.If:
🟢  248          node.body = self._instrument_body(node.body)
🟢  249          if node.orelse:
🟢  250              node.orelse = self._instrument_body(node.orelse)
🟢  251          return node
    252  
🟢  253      def visit_For(self, node: ast.For) -> ast.For:
🟢  254          node.body = self._instrument_body(node.body)
🟢  255          if node.orelse:
🟢  256              node.orelse = self._instrument_body(node.orelse)
🟢  257          return node
    258  
🟢  259      def visit_While(self, node: ast.While) -> ast.While:
🟢  260          node.body = self._instrument_body(node.body)
🟢  261          if node.orelse:
🟢  262              node.orelse = self._instrument_body(node.orelse)
🟢  263          return node
    264  
🟢  265      def visit_With(self, node: ast.With) -> ast.With:
🟢  266          node.body = self._instrument_body(node.body)
🟢  267          return node
    268  
🟢  269      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          for handler in node.handlers:
🟢  272              handler.body = self._instrument_body(handler.body)
🟢  273          if node.orelse:
🟢  274              node.orelse = self._instrument_body(node.orelse)
🟢  275          if node.finalbody:
🟢  276              node.finalbody = self._instrument_body(node.finalbody)
🟢  277          return node
    278  
    279  
🟢  280  atexit.register(flush)

🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)

🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True

🟢 python/quadrants/lang/kernel.py (80%)

     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386

🟢 python/quadrants/lang/misc.py (100%)

    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501

🔴 python/quadrants/pytest_plugin.py (50%)

      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if config.pluginmanager.hasplugin("_cov"):
🟢   12          os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
     13          # Kernel coverage always writes arc-format data; ensure pytest-cov matches to avoid
     14          # "Can not mix line and arc data" errors during coverage combine.
🟢   15          if not config.option.__dict__.get("cov_branch", False):
🔴   16              config.option.cov_branch = True

🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)

     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )

🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)

🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )

🟢 tests/python/test_api.py (100%)

🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])

🟢 tests/python/test_intrinsics.py (100%)

     55      (thread i does (i+1)*200000). Asserts strict monotonicity across threads and that
🟢   91          assert a[i - 1] < a[i] < a[i + 1]

🟢 tests/python/test_kernel_coverage.py (87%)

      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🔴  246      from quadrants.lang import _kernel_coverage
🔴  247      from quadrants.lang.simt import subgroup
    248  
🔴  249      _kernel_coverage.ensure_field_allocated()
    250  
🔴  251      N = 64
🔴  252      probe_count_before = _kernel_coverage._probe_counter
🔴  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🔴  254      a = qd.field(dtype=qd.i32, shape=(N,))
🔴  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🔴  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🔴  259      @qd.kernel
🔴  260      def simt_kernel():
🔴  261          qd.loop_config(block_dim=N)
🔴  262          for i in range(N):
🔴  263              a[i] = i + 1
🔴  264              qd.simt.block.sync()
🔴  265              if flag[0] > 0:
🔴  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🔴  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🔴  272      simt_kernel()
    273  
🔴  274      for i in range(4):
🔴  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🔴  277      cov_field = _kernel_coverage.get_field()
🔴  278      arr = cov_field.to_numpy()
    279  
🔴  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🔴  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🔴  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🔴  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🔴  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42

🟢 tests/python/test_offline_cache.py (100%)

     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22

🟢 tests/python/test_snode_layout_inspection.py (100%)

🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

run_tests.py auto-prepends "test_" to filenames that don't start with it, so the full path "tests/python/test_kernel_coverage.py" was mangled into "test_tests/python/test_kernel_coverage.py".
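The mangling can be reproduced with a small sketch. Both function names below are hypothetical and are not taken from run_tests.py; the sketch only assumes that the runner applies its `test_` prefix check to the whole argument. The second function shows one possible remedy (checking only the final path component), which may differ from what the commit actually does.

```python
import posixpath


def prepend_test_prefix(name: str) -> str:
    """Naive behaviour: the prefix check looks at the whole argument."""
    return name if name.startswith("test_") else "test_" + name


def prepend_test_prefix_fixed(name: str) -> str:
    """Apply the prefix check to the final path component only."""
    head, tail = posixpath.split(name)
    if not tail.startswith("test_"):
        tail = "test_" + tail
    return posixpath.join(head, tail)


path = "tests/python/test_kernel_coverage.py"
print(prepend_test_prefix(path))        # test_tests/python/test_kernel_coverage.py
print(prepend_test_prefix_fixed(path))  # tests/python/test_kernel_coverage.py
```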

Without this, toggling coverage between runs serves a stale cached kernel: a kernel cached with coverage off reports 0% coverage once coverage is re-enabled, and a kernel cached with coverage on raises NameError on _qd_cov once coverage is disabled.
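A toy model of the stale-cache behaviour, assuming the fix is a coverage marker mixed into the cache key. The `cache_key` and `run` helpers and the `"plain"`/`"instrumented"` strings are illustrative stand-ins, not quadrants APIs.

```python
def cache_key(source: str, coverage_on: bool, partition: bool) -> tuple:
    """Key the cache on the source, optionally adding a coverage marker."""
    marker = "kcov" if (partition and coverage_on) else ""
    return (source, marker)


def run(cache: dict, source: str, coverage_on: bool, partition: bool) -> str:
    """Return the 'compiled kernel' that this run would be served."""
    key = cache_key(source, coverage_on, partition)
    if key not in cache:
        # Cache miss: compile in the current mode and store the result.
        cache[key] = "instrumented" if coverage_on else "plain"
    return cache[key]


src = "def k(): out[0] = 1"

# Without the marker, the second run is served the first run's kernel.
stale = {}
run(stale, src, coverage_on=False, partition=False)
print(run(stale, src, coverage_on=True, partition=False))  # plain

# With the marker, each mode gets its own cache entry.
fresh = {}
run(fresh, src, coverage_on=False, partition=True)
print(run(fresh, src, coverage_on=True, partition=True))   # instrumented
```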

The previous approach of setting config.option.cov_branch in pytest_configure was too late: pytest-cov had already initialized its Coverage object in an earlier hook. Instead, the plugin now reads the cov_branch option and sets the _QD_KCOV_ARC env var, which _detect_arc_mode() checks first. This avoids "Can not mix line and arc data" errors when users run pytest --cov without --cov-branch. The default when nothing is known also changes from arc to line mode, since pytest --cov without --cov-branch is the more common invocation.
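The lookup order described here might look roughly like the sketch below. Only the `_QD_KCOV_ARC` variable name comes from the commit message; the `"1"`/`"0"` encoding, the `fallback` parameter, and both function names are assumptions made for illustration.

```python
import os
from typing import Mapping, MutableMapping, Optional


def export_branch_option(cov_branch: bool, environ: MutableMapping[str, str]) -> None:
    """Plugin side: publish pytest-cov's branch setting via the environment."""
    environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"  # assumed encoding


def detect_arc_mode(environ: Mapping[str, str] = os.environ,
                    fallback: Optional[bool] = None) -> bool:
    """Writer side: decide between arc (branch) and line coverage data."""
    explicit = environ.get("_QD_KCOV_ARC")
    if explicit is not None:
        return explicit == "1"   # explicit value wins
    if fallback is not None:
        return fallback          # any other hint the caller may have
    return False                 # nothing known: default to line mode


env = {}
print(detect_arc_mode(env))      # False
export_branch_option(True, env)
print(detect_arc_mode(env))      # True
```

Passing the decision through an environment variable sidesteps hook ordering: the value is simply read whenever the coverage data is written, regardless of when pytest-cov built its own Coverage object.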

hughperkins · 2026-04-23T22:14:23Z

@claude review

claude · 2026-04-23T22:59:09Z

+def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
+    """Rewrite a kernel/func AST to insert coverage probes.
+
+    Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
+    Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
+    """
+    global _probe_counter
+    with _lock:
+        rewriter = _CoverageASTRewriter(
+            field_name=FIELD_VAR_NAME,
+            filepath=filepath,
+            start_lineno=start_lineno,
+            probe_id_start=_probe_counter,
+        )
+        tree = rewriter.visit(tree)
+        ast.fix_missing_locations(tree)
+        _probe_counter = rewriter.next_probe_id
+        _probe_map.update(rewriter.probe_map)
+    return tree


🔴 Cached kernel IR includes baked-in _qd_cov[N] = 1 stores where N comes from the per-process monotonic _probe_counter, so probe_ids are not stable across processes. The new kcov cache-key partition in src_hasher.create_cache_key (src_hasher.py:53) correctly separates coverage-on from coverage-off, but within kcov=on two processes with identical source share a cache key while the cached IR carries literal probe_ids that reflect the first process's compilation order. When Process B loads Process A's cached IR with @qd.kernel(fastcache=True) + offline_cache=True, launches write to coverage-field slots that, in Process B's _probe_map, may belong to a completely different kernel. The result is silent false positives (another kernel's lines marked covered) and false negatives (the running kernel's lines marked uncovered). Relatedly, QD_COVERAGE_MAX_PROBES is also absent from the cache key, so shrinking the cap between runs can load IR whose baked-in probe_id exceeds the new field size, producing OOB device writes. Fix: derive probe_ids deterministically from a process-stable hash of (filepath, abs_lineno) with a module-level (filepath, lineno) -> probe_id dedup dict, and append str(_MAX_PROBES) to the hash_iterable_strings tuple. CI misses this because test_offline_cache.py is skipped under QD_KERNEL_COVERAGE=1.

Extended reasoning...

What the bug is and how it manifests

_CoverageASTRewriter._make_probe (python/quadrants/lang/_kernel_coverage.py:193-222) produces an ast.Assign node whose subscript is ast.Constant(value=probe_id). The compiled kernel IR therefore contains the literal probe_id value from whichever process first rewrote the AST. Probe IDs come from the module-level _probe_counter (lines 33, 112-123), which is per-process and monotonically increasing: the same kernel gets different probe_ids depending on which kernels materialized first in this process.
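To make the "literal probe_id" point concrete, here is a minimal standalone sketch of building such a probe node with the standard `ast` module (Python 3.9+). The `make_probe` helper is a simplified stand-in for `_make_probe`, not the PR's implementation, and the probe ID 7 is arbitrary.

```python
import ast


def make_probe(field_name: str, probe_id: int, lineno: int) -> ast.Assign:
    """Build the statement `<field_name>[<probe_id>] = 1` as an AST node."""
    target = ast.Subscript(
        value=ast.Name(id=field_name, ctx=ast.Load()),
        slice=ast.Constant(value=probe_id),  # the ID is a literal in the tree
        ctx=ast.Store(),
    )
    node = ast.Assign(targets=[target], value=ast.Constant(value=1))
    node.lineno = lineno
    return node


tree = ast.parse("def k():\n    out[0] = 1\n")
func = tree.body[0]
# Insert the probe ahead of the first statement, reusing its line number.
func.body.insert(0, make_probe("_qd_cov", 7, func.body[0].lineno))
ast.fix_missing_locations(tree)
print(ast.unparse(tree))
```

The unparsed function now starts with `_qd_cov[7] = 1`. Anything compiled from this tree carries the constant 7, whatever ID another process would have handed out for the same line.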

On top of that, src_hasher.create_cache_key (src_hasher.py:44-55) now appends "kcov" when QD_KERNEL_COVERAGE=1, correctly partitioning the cache between coverage-on and coverage-off — but the tuple contains no component reflecting the current probe_counter state, _probe_map contents, or _MAX_PROBES. Two processes with identical source + same kcov flag produce the same fastcache key, so Process B can load an IR that was compiled by Process A with Process A-specific probe_id literals.

The specific code path that triggers it

Process A compiles kernels K1, K2 in that order with QD_KERNEL_COVERAGE=1. rewrite_ast assigns probe_ids [0, 1, 2] to K1 and [3, 4, 5] to K2. The C++ IR cache (via offline_cache + offline_cache_file_path) stores the compiled kernels keyed by fastcache key under kernel_compilation_manager.cpp:277-280 (MemAndDiskCache mode). Python-side src_hasher.store writes (src_key → frontend_cache_key).

Process B starts with the same sources. A test happens to materialize K2 first (e.g. different pytest-xdist sharding, different test selection). rewrite_ast assigns probe_ids [0, 1, 2] to K2 and [3, 4, 5] to K1 in Process B's _probe_map.

Kernel.materialize() calls _try_load_fastcache (kernel.py:335-373) for K1. create_cache_key produces the same key as Process A (since kcov marker, source, args, config all match). load returns Process A's frontend_cache_key, and prog.load_fast_cache fetches the IR compiled in Process A — with _qd_cov[0] = 1, _qd_cov[1] = 1, _qd_cov[2] = 1 baked in.

materialize still calls get_tree_and_ctx, which at _func_base.py:249-257 unconditionally calls _kcov.rewrite_ast and bumps _probe_counter / _probe_map based on Process B's state. But at kernel.py:420, only_parse_function_def=self.compiled_kernel_data_by_key.get(key) is not None is True on a cache hit, and function_def_transformer.py returns before processing the body — the freshly rewritten AST is discarded and the cached IR is what runs at launch.

At K1's launch in Process B, the cached IR writes _qd_cov[0] = 1, _qd_cov[1] = 1, _qd_cov[2] = 1. In Process B's _probe_map, indices [0, 1, 2] point to K2's source lines, not K1's.

At harvest: arr[0..2] are nonzero → K2's lines marked covered even though K2 may not have run (false positives). arr[3..5] are zero → K1's lines marked uncovered even though K1 actually ran (false negatives).

Step-by-step proof with a concrete example

Suppose the codebase has two kernels, K1 (3 instrumented lines at file.py:10-12) and K2 (3 instrumented lines at file.py:20-22):

Process	Compile order	probe_ids for K1	probe_ids for K2	_probe_map
A	K1, K2	0, 1, 2	3, 4, 5	0→L10, 1→L11, 2→L12, 3→L20, 4→L21, 5→L22
B	K2, K1	3, 4, 5	0, 1, 2	0→L20, 1→L21, 2→L22, 3→L10, 4→L11, 5→L12

Process A writes the offline cache. Process B opens it, and when K1 is materialized:

create_cache_key(K1) → same key as Process A → load returns Process A's frontend_cache_key → load_fast_cache fetches IR with _qd_cov[0] = 1, _qd_cov[1] = 1, _qd_cov[2] = 1 baked in.

only_parse_function_def=True at kernel.py:420 discards Process B's rewritten AST (which would have written to 3/4/5).

Launching K1 fires slots 0, 1, 2 in Process B's field.

Harvest: _probe_map[0..2] → L20, L21, L22 (K2's lines) marked covered. K2 may never have executed in Process B — false positive. _probe_map[3..5] → L10, L11, L12 (K1's lines) are all zero — false negative even though K1 did run.
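The K1/K2 walkthrough can be replayed in plain Python. This is a simulation only: a list stands in for the coverage field, and `assign_probe_ids` is a hypothetical stand-in for the monotonic counter, not code from the PR.

```python
def assign_probe_ids(compile_order, kernel_lines):
    """Hand out monotonically increasing probe IDs in compile order."""
    ids, probe_map, counter = {}, {}, 0
    for kernel in compile_order:
        ids[kernel] = []
        for line in kernel_lines[kernel]:
            ids[kernel].append(counter)
            probe_map[counter] = line
            counter += 1
    return ids, probe_map


kernel_lines = {"K1": [10, 11, 12], "K2": [20, 21, 22]}

# Process A compiles K1 then K2 and caches K1's IR with its literal IDs.
ids_a, _ = assign_probe_ids(["K1", "K2"], kernel_lines)
cached_k1_probe_ids = ids_a["K1"]  # [0, 1, 2]

# Process B compiles K2 first, so its map gives slots 0..2 to K2.
_, probe_map_b = assign_probe_ids(["K2", "K1"], kernel_lines)

# Process B launches only K1, but from the cached IR.
field_b = [0] * len(probe_map_b)
for pid in cached_k1_probe_ids:
    field_b[pid] = 1

covered = sorted(probe_map_b[pid] for pid, hit in enumerate(field_b) if hit)
print(covered)  # [20, 21, 22]: K2's lines, although only K1 ran
```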

Why existing code does not prevent it

The PR added "kcov" to create_cache_key, demonstrating awareness of coverage-related cache partitioning, but only handles the on/off dimension. The probe_id-baked-in problem operates entirely within the kcov=on bucket.

rewrite_ast runs unconditionally in _func_base.py:254 — there is no check for whether the subsequent fastcache lookup will hit and discard its output. So _probe_counter advances and _probe_map is populated per-process regardless of what IR actually runs.

only_parse_function_def=True is by design: the PR's explicit support for @qd.kernel(fastcache=True) + coverage (via the _qd_ prefix exemption in ast_transformer_utils.py:335) means cached pure kernels are expected to carry _qd_cov writes, but nothing keeps those writes consistent with the loading process's _probe_map.

CI does not catch this: tests/python/test_offline_cache.py was skipped under QD_KERNEL_COVERAGE=1 in this same PR, and test_src_ll_cache.py was skipped for similar reasons — so no CI test exercises the kcov + offline_cache + cross-process combination.

What the impact would be

Silent correctness failure of coverage data when a user combines QD_KERNEL_COVERAGE=1, offline_cache=True, a persistent offline_cache_file_path, and @qd.kernel(fastcache=True) across multiple processes (pytest-xdist workers, sequential CI runs, or just two developers running the same test suite). Compilation order differences — which are common due to test parallelism, test discovery order, fixture setup, selective test runs — cause coverage reports with both false positives and false negatives. Nothing warns the user. The documented "automatic with pytest-cov" path explicitly opts into this combination.

Separately but relatedly, because QD_COVERAGE_MAX_PROBES is also not in the cache key, shrinking it between runs can cause a cached IR with baked-in _qd_cov[5000] = 1 to be launched against a freshly allocated (500,)-shaped field — a device-side out-of-bounds write (silent memory corruption on most backends). Narrower scenario, but the fix is the same one-line cache-key addition.

How to fix it

Two complementary changes in _kernel_coverage.py and src_hasher.py:

Make probe_ids deterministic across processes. Replace the monotonic counter with a stable hash:

    import zlib

    _probe_id_cache: dict[tuple[str, int], int] = {}  # (filepath, abs_lineno) -> probe_id

    def _probe_id_for(filepath: str, abs_lineno: int) -> int | None:
        key = (filepath, abs_lineno)
        with _lock:
            if key in _probe_id_cache:
                return _probe_id_cache[key]
            if len(_probe_id_cache) >= _MAX_PROBES:
                return None  # full; checked first so the probing loop below always terminates
            # Derive from a stable hash. The built-in hash() of a str is salted
            # per process (PYTHONHASHSEED), so it cannot be used here.
            candidate = zlib.crc32(f"{filepath}:{abs_lineno}".encode()) % _MAX_PROBES
            # Resolve collisions within the existing dict by linear probing.
            while candidate in _probe_map and _probe_map[candidate] != key:
                candidate = (candidate + 1) % _MAX_PROBES
            _probe_id_cache[key] = candidate
            _probe_map[candidate] = key
            return candidate

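Provided the hash is stable across processes and no two keys collide, this makes the IR byte-identical across processes for the same source, so the cached IR's _qd_cov[N] = 1 writes land in slots whose _probe_map entries are the same in every process. Colliding keys remain order-dependent, because linear probing gives the fallback slot to whichever key is compiled second.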

Include _MAX_PROBES in the fastcache key. In src_hasher.create_cache_key, append str(_MAX_PROBES) (or os.environ.get("QD_COVERAGE_MAX_PROBES", "100000")) to the hash_iterable_strings tuple. This prevents cache reuse across runs with different probe caps.
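As a rough illustration of the cache-key change, the sketch below hashes a source string together with coverage-related tags. `coverage_cache_tags` and `make_key` are hypothetical stand-ins; the real `create_cache_key` takes other inputs that are not shown in this thread, and only the `"kcov"` tag and the `QD_COVERAGE_MAX_PROBES` default of `100000` come from the PR.

```python
import hashlib
import os


def coverage_cache_tags(environ=os.environ):
    """Return extra cache-key components describing the coverage configuration."""
    if environ.get("QD_KERNEL_COVERAGE") != "1":
        return ("",)
    max_probes = environ.get("QD_COVERAGE_MAX_PROBES", "100000")
    return ("kcov", f"max_probes={max_probes}")


def make_key(source: str, environ=os.environ) -> str:
    """Hypothetical stand-in for create_cache_key: hash source plus coverage tags."""
    h = hashlib.sha256()
    for part in (source, *coverage_cache_tags(environ)):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent parts cannot run together
    return h.hexdigest()


src = "def k(): pass"
key_big = make_key(src, {"QD_KERNEL_COVERAGE": "1", "QD_COVERAGE_MAX_PROBES": "100000"})
key_small = make_key(src, {"QD_KERNEL_COVERAGE": "1", "QD_COVERAGE_MAX_PROBES": "500"})
key_off = make_key(src, {})
print(key_big != key_small, key_big != key_off)
```

With the probe cap in the key, IR compiled under a larger cap is never loaded into a process that allocated a smaller field, so the out-of-bounds scenario cannot arise through the cache.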

Additionally, test_offline_cache.py should have a non-skipped variant (or a dedicated test) that exercises the kcov=on + offline_cache path across two subprocesses with different compile orders, to catch future regressions.

github-actions · 2026-04-23T23:10:43Z

Coverage Report (`6081d31d0`)

Metric	Value
Diff coverage (changed lines only)	90%
Overall project coverage	72%

Total: 451 lines, 47 missing, 90% covered

🔴 python/quadrants/lang/_fast_caching/src_hasher.py (0%)

🔴    2  import os
     53              "kcov" if os.environ.get("QD_KERNEL_COVERAGE") == "1" else "",

🟢 python/quadrants/lang/_func_base.py (82%)

🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field

🔴 python/quadrants/lang/_kernel_coverage.py (79%)

      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov is running in branch (arc) mode.
    129  
    130      Checks _QD_KCOV_ARC env var first (set by the pytest plugin), then falls back to reading .coverage.
    131      Defaults to False (line mode) when nothing is known, since ``pytest --cov`` without ``--cov-branch``
    132      is the more common invocation.
    133      """
🔴  134      arc_env = os.environ.get("_QD_KCOV_ARC")
🔴  135      if arc_env is not None:
🔴  136          return arc_env == "1"
🔴  137      try:
🔴  138          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  139          cd = CoverageData(basename=cov_path)
🔴  140          cd.read()
🔴  141          if not cd.measured_files():
🔴  142              return False
🔴  143          return cd.has_arcs()
🔴  144      except Exception:
🔴  145          logging.debug("Failed to detect arc mode from .coverage file, defaulting to line mode", exc_info=True)
🔴  146          return False
    147  
    148  
🟢  149  def flush() -> None:
    150      """Harvest any remaining field data and write all results to a .coverage file.
    151  
    152      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    153      is lost across multiple invocations.
    154      """
🔴  155      _harvest_field()
    156  
🔴  157      with _lock:
🔴  158          if not _accumulated_lines:
🔴  159              return
🔴  160          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    161  
🔴  162      base_dir = _coverage_dir or os.getcwd()
🔴  163      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  164      use_arcs = _detect_arc_mode()
    165  
🔴  166      cov = CoverageData(basename=kernel_path)
🔴  167      if use_arcs:
🔴  168          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  169          for filepath, lines in snapshot.items():
    170              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    171              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  172              arcs = []
🔴  173              for line in sorted(lines):
🔴  174                  arcs.append((-1, line))
🔴  175                  arcs.append((line, -1))
🔴  176              arcs_by_file[filepath] = arcs
🔴  177          cov.add_arcs(arcs_by_file)
    178      else:
🔴  179          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  180      cov.write()
    181  
    182  
🟢  183  _capacity_warning_emitted = False
    184  
    185  
🟢  186  class _CoverageASTRewriter(ast.NodeTransformer):
    187      """Insert coverage probes before each statement at a new source line."""
    188  
🟢  189      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  190          self._field_name = field_name
🟢  191          self._filepath = filepath
🟢  192          self._start_lineno = start_lineno
🟢  193          self.next_probe_id = probe_id_start
🟢  194          self._seen_lines: set[int] = set()
🟢  195          self.probe_map: dict[int, tuple[str, int]] = {}
    196  
🟢  197      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    198          global _capacity_warning_emitted
🟢  199          probe_id = self.next_probe_id
🟢  200          if probe_id >= _MAX_PROBES:
🟢  201              if not _capacity_warning_emitted:
🟢  202                  warnings.warn(
    203                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    204                      f"Additional kernel lines will not be tracked. "
    205                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    206                      stacklevel=2,
    207                  )
🟢  208                  _capacity_warning_emitted = True
🟢  209              return None
🟢  210          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  211          self.next_probe_id += 1
🟢  212          node = ast.Assign(
    213              targets=[
    214                  ast.Subscript(
    215                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    216                      slice=ast.Constant(value=probe_id),
    217                      ctx=ast.Store(),
    218                  )
    219              ],
    220              value=ast.Constant(value=1),
    221              lineno=rel_lineno,
    222              col_offset=col_offset,
    223              end_lineno=rel_lineno,
    224              end_col_offset=col_offset,
    225          )
🟢  226          return node
    227  
🟢  228      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  229          result: list[ast.stmt] = []
🟢  230          for stmt in stmts:
🟢  231              rel_lineno = getattr(stmt, "lineno", None)
🟢  232              if rel_lineno is not None:
🟢  233                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  234                  if abs_lineno not in self._seen_lines:
🟢  235                      self._seen_lines.add(abs_lineno)
🟢  236                      col = getattr(stmt, "col_offset", 0)
🟢  237                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  238                      if probe is not None:
🟢  239                          result.append(probe)
🟢  240              result.append(self.visit(stmt))
🟢  241          return result
    242  
🟢  243      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  244          node.body = self._instrument_body(node.body)
🟢  245          return node
    246  
🟢  247      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  248          node.body = self._instrument_body(node.body)
🔴  249          return node
    250  
🟢  251      def visit_If(self, node: ast.If) -> ast.If:
🟢  252          node.body = self._instrument_body(node.body)
🟢  253          if node.orelse:
🟢  254              node.orelse = self._instrument_body(node.orelse)
🟢  255          return node
    256  
🟢  257      def visit_For(self, node: ast.For) -> ast.For:
🟢  258          node.body = self._instrument_body(node.body)
🟢  259          if node.orelse:
🟢  260              node.orelse = self._instrument_body(node.orelse)
🟢  261          return node
    262  
🟢  263      def visit_While(self, node: ast.While) -> ast.While:
🟢  264          node.body = self._instrument_body(node.body)
🟢  265          if node.orelse:
🟢  266              node.orelse = self._instrument_body(node.orelse)
🟢  267          return node
    268  
🟢  269      def visit_With(self, node: ast.With) -> ast.With:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          return node
    272  
🟢  273      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  274          node.body = self._instrument_body(node.body)
🟢  275          for handler in node.handlers:
🟢  276              handler.body = self._instrument_body(handler.body)
🟢  277          if node.orelse:
🟢  278              node.orelse = self._instrument_body(node.orelse)
🟢  279          if node.finalbody:
🟢  280              node.finalbody = self._instrument_body(node.finalbody)
🟢  281          return node
    282  
    283  
🟢  284  atexit.register(flush)
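
The probe-insertion technique in this module can be tried in isolation with only the standard library. The following is a minimal sketch, not the module's code: `ProbeInserter` and the plain list `_cov` are hypothetical substitutes for `_CoverageASTRewriter` and the quadrants field, and it omits same-line deduplication, the `start_lineno` offset, and the capacity check.

```python
import ast
import textwrap

SRC = textwrap.dedent(
    """\
    def f(x):
        if x > 5:
            y = 1
        else:
            y = 2
        return y
    """
)


class ProbeInserter(ast.NodeTransformer):
    """Prepend `_cov[<id>] = 1` to each statement in function and if/else bodies."""

    def __init__(self):
        self.probe_map = {}  # probe_id -> source line number

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            probe_id = len(self.probe_map)
            self.probe_map[probe_id] = stmt.lineno
            probe = ast.parse(f"_cov[{probe_id}] = 1").body[0]
            out.append(ast.copy_location(probe, stmt))
            out.append(self.visit(stmt))  # recurse so nested bodies get probes too
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        node.orelse = self._instrument(node.orelse)
        return node


inserter = ProbeInserter()
tree = ast.fix_missing_locations(inserter.visit(ast.parse(SRC)))
cov = [0] * len(inserter.probe_map)
namespace = {"_cov": cov}
exec(compile(tree, "<sketch>", "exec"), namespace)  # defines the instrumented f
namespace["f"](10)  # takes the if-branch only
covered = sorted(line for pid, line in inserter.probe_map.items() if cov[pid])
print(covered)
```

Because the probe for the `else` body sits inside that body, it stays at zero when the `if` branch runs, which is how probes inside branches give per-branch information.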

🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)

🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True

🟢 python/quadrants/lang/kernel.py (80%)

     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386

🟢 python/quadrants/lang/misc.py (100%)

    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501

🔴 python/quadrants/pytest_plugin.py (67%)

      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if not config.pluginmanager.hasplugin("_cov"):
🔴   12          return
🟢   13      os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
🟢   14      if os.environ.get("QD_KERNEL_COVERAGE") != "1":
🟢   15          return
     16      # Tell the kernel coverage module whether pytest-cov is running in branch (arc) mode,
     17      # so it writes the matching format and avoids "Can not mix line and arc data" at combine time.
     18      # We read config.option.cov_branch which pytest-cov has already populated by this point.
🟢   19      cov_branch = getattr(config.option, "cov_branch", False) or False
🟢   20      os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
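
The opt-in and opt-out behaviour of `pytest_configure` can be restated as a pure function, which makes the three outcomes easy to check. This is a sketch only: `resolve_kernel_coverage` is a hypothetical helper that works on a copy of a dict rather than on `os.environ`, and `cov_active` stands in for the `hasplugin("_cov")` check.

```python
def resolve_kernel_coverage(environ: dict, cov_active: bool, cov_branch: bool) -> dict:
    """Return a copy of environ updated the way the plugin would update os.environ."""
    env = dict(environ)
    if not cov_active:
        return env  # pytest-cov not loaded: leave everything untouched
    env.setdefault("QD_KERNEL_COVERAGE", "1")  # only fills in a missing value
    if env["QD_KERNEL_COVERAGE"] != "1":
        return env  # explicit opt-out such as QD_KERNEL_COVERAGE=0 is respected
    env["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
    return env


print(resolve_kernel_coverage({}, cov_active=True, cov_branch=False))
print(resolve_kernel_coverage({"QD_KERNEL_COVERAGE": "0"}, cov_active=True, cov_branch=True))
print(resolve_kernel_coverage({}, cov_active=False, cov_branch=True))
```

The use of `setdefault` is what lets a user-supplied `QD_KERNEL_COVERAGE=0` survive even when pytest-cov is active.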

🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)

     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )

🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)

🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )

🟢 tests/python/test_api.py (100%)

🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])

🟢 tests/python/test_intrinsics.py (100%)

     55      (thread i does (i+1)*200000). Asserts strict monotonicity across threads and that
🟢   91          assert a[i - 1] < a[i] < a[i + 1]

🟢 tests/python/test_kernel_coverage.py (99%)

      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🟢  246      from quadrants.lang import _kernel_coverage
🟢  247      from quadrants.lang.simt import subgroup
    248  
🟢  249      _kernel_coverage.ensure_field_allocated()
    250  
🟢  251      N = 64
🟢  252      probe_count_before = _kernel_coverage._probe_counter
🟢  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🟢  254      a = qd.field(dtype=qd.i32, shape=(N,))
🟢  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🟢  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🟢  259      @qd.kernel
🟢  260      def simt_kernel():
🟢  261          qd.loop_config(block_dim=N)
🟢  262          for i in range(N):
🟢  263              a[i] = i + 1
🟢  264              qd.simt.block.sync()
🟢  265              if flag[0] > 0:
🟢  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🟢  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🟢  272      simt_kernel()
    273  
🟢  274      for i in range(4):
🟢  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🟢  277      cov_field = _kernel_coverage.get_field()
🟢  278      arr = cov_field.to_numpy()
    279  
🟢  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🟢  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🟢  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🟢  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🟢  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42
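
The end-to-end tests above share one pattern: record `_probe_counter` before defining a kernel, then look only at probe ids at or above that value to see which ones fired. A minimal sketch of that selection logic follows. It assumes a plain dict stands in for `_probe_map` and a plain list for the array returned by `to_numpy()`; `fired_probes` is a hypothetical helper, not part of the PR.

```python
def fired_probes(probe_map, arr, start, stop=None):
    """Return ids of probes in [start, stop) whose slot in `arr` is non-zero.

    probe_map: {probe_id: (filepath, lineno)}, a stand-in for _probe_map.
    arr: sequence of ints, a stand-in for the coverage field contents.
    """
    stop = len(arr) if stop is None else stop
    return {pid for pid in probe_map if start <= pid < stop and arr[pid] != 0}


# Two "kernels": ids 0-1 were registered first, ids 2-3 second.
probe_map = {0: ("a.py", 10), 1: ("a.py", 11), 2: ("b.py", 5), 3: ("b.py", 7)}
arr = [1, 0, 1, 0]

fired_first = fired_probes(probe_map, arr, start=0, stop=2)
fired_second = fired_probes(probe_map, arr, start=2)
```

Slicing by id range works because probe ids are handed out sequentially, so everything registered after the recorded counter value belongs to kernels compiled later.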

🟢 tests/python/test_offline_cache.py (100%)

     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22

🟢 tests/python/test_snode_layout_inspection.py (100%)

🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

hughperkins · 2026-04-23T23:12:40Z

Update from Opus:

One new comment (59 → 60). It repeats earlier comment #38: three e2e tests dereference get_field() without checking for None first, while the other three tests in the same file do have the guard.

This is a minor code quality issue (yellow severity). If get_field() returned None, the tests would fail with an opaque AttributeError instead of a clear AssertionError. Want me to add the guards?

so basically, no need to do anything, and good to merge now?

Yes, that's my read. The remaining comment is a minor style consistency issue (adding assert cov_field is not None to 3 tests that already work fine). It is not a bug or a correctness issue, just a slightly better error message in a failure mode that doesn't happen in practice.

Good to merge.
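
For reference, the difference between the two failure modes discussed above can be sketched as follows. The `get_field` here is a hypothetical stand-in that always returns None to simulate the failure; it is not the real `_kernel_coverage.get_field()`.

```python
def get_field():
    """Hypothetical stand-in: simulate the case where no coverage field is available."""
    return None


def read_without_guard():
    # Fails with: AttributeError: 'NoneType' object has no attribute 'to_numpy'
    return get_field().to_numpy()


def read_with_guard():
    cov_field = get_field()
    # Fails with an AssertionError that names the actual problem.
    assert cov_field is not None, "coverage field is not allocated for the current program"
    return cov_field.to_numpy()
```

Both versions fail in this situation; the guard only changes what the failure message says.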

…4_test.sh) Made-with: Cursor

github-actions · 2026-04-24T00:43:16Z

Coverage Report (`50e1f8ad6`)

| Metric | Value |
| --- | --- |
| Diff coverage (changed lines only) | 90% |
| Overall project coverage | 72% |

Total: 451 lines, 47 missing, 90% covered
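
The 90% figure follows from the totals above, and can be checked against the 80% gate on changed lines mentioned in the PR description. The sketch below is only the arithmetic; how the report tool rounds internally is an assumption, and `diff_coverage_percent` is a hypothetical name.

```python
def diff_coverage_percent(total_lines: int, missing_lines: int) -> float:
    """Percentage of changed lines that were executed."""
    return 100.0 * (total_lines - missing_lines) / total_lines


pct = diff_coverage_percent(total_lines=451, missing_lines=47)  # 404 / 451
passes_gate = pct >= 80.0  # 80% gate, per the PR description
print(f"{pct:.2f}% covered, gate passed: {passes_gate}")
```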

🔴 python/quadrants/lang/_fast_caching/src_hasher.py (0%)

🔴    2  import os
     53              "kcov" if os.environ.get("QD_KERNEL_COVERAGE") == "1" else "",

🟢 python/quadrants/lang/_func_base.py (82%)

🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field

🔴 python/quadrants/lang/_kernel_coverage.py (79%)

      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov is running in branch (arc) mode.
    129  
    130      Checks _QD_KCOV_ARC env var first (set by the pytest plugin), then falls back to reading .coverage.
    131      Defaults to False (line mode) when nothing is known, since ``pytest --cov`` without ``--cov-branch``
    132      is the more common invocation.
    133      """
🔴  134      arc_env = os.environ.get("_QD_KCOV_ARC")
🔴  135      if arc_env is not None:
🔴  136          return arc_env == "1"
🔴  137      try:
🔴  138          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  139          cd = CoverageData(basename=cov_path)
🔴  140          cd.read()
🔴  141          if not cd.measured_files():
🔴  142              return False
🔴  143          return cd.has_arcs()
🔴  144      except Exception:
🔴  145          logging.debug("Failed to detect arc mode from .coverage file, defaulting to line mode", exc_info=True)
🔴  146          return False
    147  
    148  
🟢  149  def flush() -> None:
    150      """Harvest any remaining field data and write all results to a .coverage file.
    151  
    152      Data is written to a per-process ``_qd_kcov.<pid>`` file in the coverage directory, using line or arc
    153      format to match the detected pytest-cov mode.
    154      """
🔴  155      _harvest_field()
    156  
🔴  157      with _lock:
🔴  158          if not _accumulated_lines:
🔴  159              return
🔴  160          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    161  
🔴  162      base_dir = _coverage_dir or os.getcwd()
🔴  163      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  164      use_arcs = _detect_arc_mode()
    165  
🔴  166      cov = CoverageData(basename=kernel_path)
🔴  167      if use_arcs:
🔴  168          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  169          for filepath, lines in snapshot.items():
    170              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    171              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  172              arcs = []
🔴  173              for line in sorted(lines):
🔴  174                  arcs.append((-1, line))
🔴  175                  arcs.append((line, -1))
🔴  176              arcs_by_file[filepath] = arcs
🔴  177          cov.add_arcs(arcs_by_file)
    178      else:
🔴  179          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  180      cov.write()
    181  
    182  
🟢  183  _capacity_warning_emitted = False
    184  
    185  
🟢  186  class _CoverageASTRewriter(ast.NodeTransformer):
    187      """Insert coverage probes before each statement at a new source line."""
    188  
🟢  189      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  190          self._field_name = field_name
🟢  191          self._filepath = filepath
🟢  192          self._start_lineno = start_lineno
🟢  193          self.next_probe_id = probe_id_start
🟢  194          self._seen_lines: set[int] = set()
🟢  195          self.probe_map: dict[int, tuple[str, int]] = {}
    196  
🟢  197      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    198          global _capacity_warning_emitted
🟢  199          probe_id = self.next_probe_id
🟢  200          if probe_id >= _MAX_PROBES:
🟢  201              if not _capacity_warning_emitted:
🟢  202                  warnings.warn(
    203                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    204                      f"Additional kernel lines will not be tracked. "
    205                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    206                      stacklevel=2,
    207                  )
🟢  208                  _capacity_warning_emitted = True
🟢  209              return None
🟢  210          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  211          self.next_probe_id += 1
🟢  212          node = ast.Assign(
    213              targets=[
    214                  ast.Subscript(
    215                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    216                      slice=ast.Constant(value=probe_id),
    217                      ctx=ast.Store(),
    218                  )
    219              ],
    220              value=ast.Constant(value=1),
    221              lineno=rel_lineno,
    222              col_offset=col_offset,
    223              end_lineno=rel_lineno,
    224              end_col_offset=col_offset,
    225          )
🟢  226          return node
    227  
🟢  228      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  229          result: list[ast.stmt] = []
🟢  230          for stmt in stmts:
🟢  231              rel_lineno = getattr(stmt, "lineno", None)
🟢  232              if rel_lineno is not None:
🟢  233                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  234                  if abs_lineno not in self._seen_lines:
🟢  235                      self._seen_lines.add(abs_lineno)
🟢  236                      col = getattr(stmt, "col_offset", 0)
🟢  237                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  238                      if probe is not None:
🟢  239                          result.append(probe)
🟢  240              result.append(self.visit(stmt))
🟢  241          return result
    242  
🟢  243      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  244          node.body = self._instrument_body(node.body)
🟢  245          return node
    246  
🟢  247      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  248          node.body = self._instrument_body(node.body)
🔴  249          return node
    250  
🟢  251      def visit_If(self, node: ast.If) -> ast.If:
🟢  252          node.body = self._instrument_body(node.body)
🟢  253          if node.orelse:
🟢  254              node.orelse = self._instrument_body(node.orelse)
🟢  255          return node
    256  
🟢  257      def visit_For(self, node: ast.For) -> ast.For:
🟢  258          node.body = self._instrument_body(node.body)
🟢  259          if node.orelse:
🟢  260              node.orelse = self._instrument_body(node.orelse)
🟢  261          return node
    262  
🟢  263      def visit_While(self, node: ast.While) -> ast.While:
🟢  264          node.body = self._instrument_body(node.body)
🟢  265          if node.orelse:
🟢  266              node.orelse = self._instrument_body(node.orelse)
🟢  267          return node
    268  
🟢  269      def visit_With(self, node: ast.With) -> ast.With:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          return node
    272  
🟢  273      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  274          node.body = self._instrument_body(node.body)
🟢  275          for handler in node.handlers:
🟢  276              handler.body = self._instrument_body(handler.body)
🟢  277          if node.orelse:
🟢  278              node.orelse = self._instrument_body(node.orelse)
🟢  279          if node.finalbody:
🟢  280              node.finalbody = self._instrument_body(node.finalbody)
🟢  281          return node
    282  
    283  
🟢  284  atexit.register(flush)
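
The rewriting approach in this file can be illustrated outside quadrants with a minimal sketch. It assumes a plain Python list stands in for the coverage field and `exec` stands in for kernel compilation. `ProbeInserter`, `run_instrumented`, and `_cov` are hypothetical names, and the sketch leaves out the same-line deduplication, capacity limit, and locking that the real module has. It needs Python 3.9+ for the `ast.Subscript` form used here.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Insert `_cov[<id>] = 1` before each statement in function and if/else bodies."""

    def __init__(self, field_name: str, start_lineno: int = 1) -> None:
        self.field_name = field_name
        self.start_lineno = start_lineno
        self.probe_map: dict[int, int] = {}  # probe_id -> absolute line number

    def _probe(self, stmt: ast.stmt) -> ast.Assign:
        probe_id = len(self.probe_map)
        self.probe_map[probe_id] = stmt.lineno + self.start_lineno - 1
        target = ast.Subscript(
            value=ast.Name(id=self.field_name, ctx=ast.Load()),
            slice=ast.Constant(value=probe_id),
            ctx=ast.Store(),
        )
        probe = ast.Assign(targets=[target], value=ast.Constant(value=1))
        return ast.copy_location(probe, stmt)

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            out.append(self._probe(stmt))  # the probe runs just before the statement
            out.append(self.visit(stmt))   # recurse so nested if/else bodies get probes
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        if node.orelse:
            node.orelse = self._instrument(node.orelse)
        return node


def run_instrumented(src: str, func_name: str, *args):
    """Instrument `src`, call `func_name(*args)`, and return the set of executed lines."""
    tree = ast.parse(textwrap.dedent(src))
    inserter = ProbeInserter("_cov")
    tree = ast.fix_missing_locations(inserter.visit(tree))
    cov = [0] * len(inserter.probe_map)  # list standing in for the coverage field
    namespace = {"_cov": cov}
    exec(compile(tree, "<sketch>", "exec"), namespace)
    namespace[func_name](*args)
    return {line for pid, line in inserter.probe_map.items() if cov[pid]}


SRC = "def f(x):\n    if x > 5:\n        y = 1\n    else:\n        y = 2\n    return y\n"
lines_taken = run_instrumented(SRC, "f", 10)
lines_not_taken = run_instrumented(SRC, "f", 0)
```

Because each probe sits inside the body it describes, the probe for the `else` line stays at zero when the `if` branch runs, which is what gives per-branch information at runtime.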

🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)

🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True
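
The prefix exemption shown above can be sketched in isolation. Only the `startswith("_qd_")` check comes from the listing; how the real transformer collects global names is not shown here, so the sketch takes them as an input list, and `find_pure_violations` is a hypothetical name.

```python
def find_pure_violations(global_names):
    """Return the names that would violate a pure kernel.

    Names starting with `_qd_` (such as the injected `_qd_cov` field) are exempt.
    """
    return [name for name in global_names if not name.startswith("_qd_")]


violations = find_pure_violations(["_qd_cov", "my_global_field"])
```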

🟢 python/quadrants/lang/kernel.py (80%)

     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386

🟢 python/quadrants/lang/misc.py (100%)

    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501

🔴 python/quadrants/pytest_plugin.py (67%)

      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if not config.pluginmanager.hasplugin("_cov"):
🔴   12          return
🟢   13      os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
🟢   14      if os.environ.get("QD_KERNEL_COVERAGE") != "1":
🟢   15          return
     16      # Tell the kernel coverage module whether pytest-cov is running in branch (arc) mode,
     17      # so it writes the matching format and avoids "Can not mix line and arc data" at combine time.
     18      # We read config.option.cov_branch which pytest-cov has already populated by this point.
🟢   19      cov_branch = getattr(config.option, "cov_branch", False) or False
🟢   20      os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
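
The decision logic in `pytest_configure` can be sketched as a pure function over a dict, which makes the opt-out behavior easier to see. This is an illustration only: `resolve_kernel_coverage_env` and its parameters are hypothetical, and the real plugin mutates `os.environ` directly.

```python
def resolve_kernel_coverage_env(env: dict, cov_plugin_active: bool, cov_branch: bool) -> dict:
    """Return a copy of `env` with the variables the plugin would set."""
    env = dict(env)
    if not cov_plugin_active:
        return env  # pytest-cov not loaded: leave everything untouched
    env.setdefault("QD_KERNEL_COVERAGE", "1")  # enable unless the user already chose a value
    if env["QD_KERNEL_COVERAGE"] != "1":
        return env  # explicit opt-out, e.g. QD_KERNEL_COVERAGE=0
    env["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
    return env


default_on = resolve_kernel_coverage_env({}, cov_plugin_active=True, cov_branch=False)
opted_out = resolve_kernel_coverage_env({"QD_KERNEL_COVERAGE": "0"}, cov_plugin_active=True, cov_branch=True)
```

Using `setdefault` is what lets an explicit `QD_KERNEL_COVERAGE=0` win over the automatic enablement.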

🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)

     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )

🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)

🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )

🟢 tests/python/test_api.py (100%)

🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])

🟢 tests/python/test_intrinsics.py (100%)

     55      (thread i does (i+1)*200000). Asserts strict monotonicity across threads and that
🟢   91          assert a[i - 1] < a[i] < a[i + 1]

🟢 tests/python/test_kernel_coverage.py (99%)

      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🟢  246      from quadrants.lang import _kernel_coverage
🟢  247      from quadrants.lang.simt import subgroup
    248  
🟢  249      _kernel_coverage.ensure_field_allocated()
    250  
🟢  251      N = 64
🟢  252      probe_count_before = _kernel_coverage._probe_counter
🟢  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🟢  254      a = qd.field(dtype=qd.i32, shape=(N,))
🟢  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🟢  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🟢  259      @qd.kernel
🟢  260      def simt_kernel():
🟢  261          qd.loop_config(block_dim=N)
🟢  262          for i in range(N):
🟢  263              a[i] = i + 1
🟢  264              qd.simt.block.sync()
🟢  265              if flag[0] > 0:
🟢  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🟢  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🟢  272      simt_kernel()
    273  
🟢  274      for i in range(4):
🟢  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🟢  277      cov_field = _kernel_coverage.get_field()
🟢  278      arr = cov_field.to_numpy()
    279  
🟢  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🟢  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🟢  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🟢  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🟢  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42

🟢 tests/python/test_offline_cache.py (100%)

     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22

🟢 tests/python/test_snode_layout_inspection.py (100%)

🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

erizmr

Approved as discussed.

duburcqa · 2026-04-24T21:47:13Z

claude · 2026-04-24T21:59:28Z

+    cov_branch = getattr(config.option, "cov_branch", False) or False
+    os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"


🔴 The pytest plugin at pytest_plugin.py:19 reads only config.option.cov_branch to decide between arc and line mode, but pytest-cov defines --cov-branch with default=None. When users enable branch coverage via branch = True in .coveragerc or [tool.coverage.run] in pyproject.toml without passing the CLI flag, cov_branch evaluates to (None or False), which is False, and the plugin sets _QD_KCOV_ARC=0. pytest-cov itself runs in arc mode because coverage.py applies the config-file setting, so _qd_kcov.* is written in line format while .coverage is in arc format. coverage combine then fails with "Can not mix line and arc data", and the fallback in tests/coverage_report.py:combine_coverage() silently drops all kernel coverage. Fix: read the authoritative resolved value via config.pluginmanager.get_plugin("_cov").cov_controller.cov.config.branch.

Extended reasoning...

What the bug is and how it manifests

The pytest plugin in python/quadrants/pytest_plugin.py (lines 10-20) tries to inform _kernel_coverage whether pytest-cov is running in branch (arc) mode by setting the _QD_KCOV_ARC env var. The check at line 19 is:

cov_branch = getattr(config.option, "cov_branch", False) or False
os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"

This reflects only the pytest CLI flag --cov-branch, not the branch setting from coverage.py configuration files. pytest-cov registers --cov-branch with default=None and action='store_true' (verified at /usr/local/lib/python3.11/dist-packages/pytest_cov/plugin.py:160-163), so config.option.cov_branch is None whenever the CLI flag is absent. The expression (None or False) evaluates to False, so _QD_KCOV_ARC is set to '0' (line mode).

But pytest-cov passes branch=options.cov_branch directly to coverage.Coverage(...) (engine.py:61, 239). When None is passed, coverage.config.from_args() skips it (if v is not None, config.py:288-294), so any config-file [run] branch = True setting survives. pytest-cov is therefore in arc mode while the plugin sets _QD_KCOV_ARC=0.
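The None default can be reproduced with plain argparse. This is a minimal sketch using a stand-in parser, not pytest-cov's real one; only the --cov-branch flag name, action='store_true', and default=None are taken from the description above.

```python
import argparse

# Stand-in parser (hypothetical): mirrors only the flag definition described above.
parser = argparse.ArgumentParser()
parser.add_argument("--cov-branch", action="store_true", default=None)

absent = parser.parse_args([])                 # flag not passed
present = parser.parse_args(["--cov-branch"])  # flag passed

# The plugin's expression collapses "not specified" (None) into False,
# so it cannot tell "flag absent" apart from "branch coverage disabled".
plugin_view_absent = getattr(absent, "cov_branch", False) or False
plugin_view_present = getattr(present, "cov_branch", False) or False
```

With the flag absent the option is None, which is exactly the case where a config-file setting may still turn branch coverage on.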

The specific code path that triggers it

User has [run] branch = true in .coveragerc (or [tool.coverage.run] branch = true in pyproject.toml) and runs pytest --cov=my_pkg without --cov-branch.

Quadrants pytest plugin runs: config.option.cov_branch is None; None or False = False; sets _QD_KCOV_ARC=0.

_kernel_coverage._detect_arc_mode() reads _QD_KCOV_ARC=0 and returns False (line mode).

flush() writes _qd_kcov.<pid> in line format.

Meanwhile pytest-cov writes .coverage in arc format (because coverage.Coverage(branch=None) reads .coveragerc and applies branch=True).

tests/coverage_report.py:combine_coverage() runs coverage combine .coverage.pytest _qd_kcov.* → DataError: Can not mix line and arc data.

The fallback at lines 50-51 (coverage combine .coverage.pytest alone) silently drops all kernel coverage.
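The two data shapes involved can be sketched without coverage.py itself. Line data maps each file to executed line numbers, while arc data maps each file to (from_line, to_line) pairs; the -1 entry/exit pairs below follow what this PR's flush() emits in arc mode. The helper names and the sample path are hypothetical.

```python
def to_line_data(executed):
    """Line-mode payload: {filepath: sorted line numbers}."""
    return {path: sorted(lines) for path, lines in executed.items()}


def to_arc_data(executed):
    """Arc-mode payload: one entry arc (-1, n) and one exit arc (n, -1) per line."""
    arcs = {}
    for path, lines in executed.items():
        pairs = []
        for n in sorted(lines):
            pairs.append((-1, n))
            pairs.append((n, -1))
        arcs[path] = pairs
    return arcs


# Hypothetical harvest result: lines 10 and 12 of kernels.py executed.
executed = {"kernels.py": {12, 10}}
line_data = to_line_data(executed)
arc_data = to_arc_data(executed)
```

The kernel data file has to use the same shape as pytest-cov's file, which is why the mode must be detected correctly before coverage combine runs.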

Empirical verification

A verifier confirmed empirically with pytest-cov 7.1.0:

# With .coveragerc containing [run]\nbranch = True:
cov = coverage.Coverage(source=['.'], branch=None, config_file='.coveragerc')
print(cov.config.branch)  # -> True
# config.option.cov_branch -> None
# cov_branch after `or False` -> False (incorrect)
# cov_controller.cov.config.branch -> True (authoritative)

This proves the plugin's check disagrees with pytest-cov's actual mode.

Why existing code does not prevent it

The plugin guard getattr(config.option, "cov_branch", False) or False is purely a CLI-flag check. Coverage.py's three-layer config resolution (defaults → config file → from_args) is opaque to the plugin code. There is no fallback path that reads coverage.config after pytest-cov has resolved it. The PR's own combine_coverage() fallback masks the resulting error rather than surfacing it.
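The layering described above can be modelled in a few lines. This is a sketch of the idea only, not coverage.py's code: resolve_branch and its parameters are hypothetical, and the one behaviour borrowed from the description is that a None argument is skipped so the config-file value survives.

```python
def resolve_branch(file_value=None, cli_value=None, default=False):
    """Resolve a setting through defaults, then config file, then CLI arguments."""
    value = default
    if file_value is not None:  # a config-file value overrides the default
        value = file_value
    if cli_value is not None:   # an explicit CLI value overrides the file
        value = cli_value
    return value


# Config file says branch = True and the CLI flag is absent (None).
effective = resolve_branch(file_value=True, cli_value=None)

# What the plugin computes from the CLI option alone.
plugin_guess = None or False
```

Here effective is True while plugin_guess is False, which is the disagreement the review describes.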

Impact

Any downstream user of quadrants who enables branch coverage via .coveragerc or pyproject.toml and runs pytest --cov=my_pkg without explicitly passing --cov-branch will silently lose all kernel coverage. This is a common, idiomatic configuration pattern — many Python projects keep coverage settings in config files rather than command lines. The failure is silent: the coverage comment on the PR shows pytest-cov coverage but no kernel data, with no warning or error visible to the user.

Step-by-step proof

Project has pyproject.toml containing [tool.coverage.run] with branch = true.

Developer runs pytest --cov=my_pkg tests/ (no --cov-branch).

pytest-cov initializes: options.cov_branch = None. Coverage(branch=None, ...) reads pyproject.toml, sets config.branch = True. Tracer starts in arc mode.

quadrants pytest plugin's pytest_configure fires: config.option.cov_branch = None; None or False = False; _QD_KCOV_ARC = "0".

Tests run. Kernels execute. Probes accumulate.

atexit triggers flush(): _detect_arc_mode() reads _QD_KCOV_ARC=0 → False → add_lines path → _qd_kcov.<pid> in line format.

CI runs combine_coverage(): coverage combine .coverage.pytest _qd_kcov.* → DataError: Can not mix line and arc data → fallback drops kernel files → final .coverage contains only pytest-cov data.

Coverage report shows 0% kernel coverage.

How to fix it

Query the authoritative source — pytest-cov's already-constructed Coverage instance — instead of the CLI option:

def pytest_configure(config):
    if not config.pluginmanager.hasplugin("_cov"):
        return
    os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
    if os.environ.get("QD_KERNEL_COVERAGE") != "1":
        return
    branch = False
    cov_plugin = config.pluginmanager.get_plugin("_cov")
    if cov_plugin is not None:
        ctrl = getattr(cov_plugin, "cov_controller", None)
        if ctrl is not None and getattr(ctrl, "cov", None) is not None:
            branch = bool(ctrl.cov.config.branch)
    os.environ["_QD_KCOV_ARC"] = "1" if branch else "0"

pytest-cov constructs cov_controller in pytest_load_initial_conftests (with tryfirst=True), which runs strictly before pytest_configure, so cov_controller.cov.config.branch is already authoritative and queryable by the time this hook fires. This reflects the merged result of CLI flags, config files, and defaults — matching pytest-cov's actual runtime behavior.
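One way to sanity-check the proposed lookup without pytest is to run it against stand-in objects. In this sketch resolved_branch is a hypothetical helper that repeats the attribute chain from the fix, and the SimpleNamespace objects only imitate the shapes named above (cov_controller, cov, config.branch); they are not pytest-cov's classes.

```python
from types import SimpleNamespace


def resolved_branch(cov_plugin):
    """Return the branch setting from a pytest-cov-like plugin object, or False."""
    if cov_plugin is None:
        return False
    ctrl = getattr(cov_plugin, "cov_controller", None)
    cov = getattr(ctrl, "cov", None) if ctrl is not None else None
    if cov is None:
        return False
    return bool(cov.config.branch)


# Stand-in for a session where the config file enabled branch coverage.
plugin_with_branch = SimpleNamespace(
    cov_controller=SimpleNamespace(
        cov=SimpleNamespace(config=SimpleNamespace(branch=True))
    )
)
# Stand-in for a session where no controller was created.
plugin_without_controller = SimpleNamespace(cov_controller=None)

arc_flag = "1" if resolved_branch(plugin_with_branch) else "0"
```

The guarded chain falls back to line mode whenever any link is missing, so the plugin still degrades gracefully if pytest-cov's internals differ from what is assumed here.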

github-actions · 2026-04-24T23:07:16Z

Coverage Report (`40a0ca011`)

Metric	Value
Diff coverage (changed lines only)	90%
Overall project coverage	72%

Total: 450 lines, 47 missing, 90% covered

🔴 python/quadrants/lang/_fast_caching/src_hasher.py (0%)

🔴    2  import os
     53              "kcov" if os.environ.get("QD_KERNEL_COVERAGE") == "1" else "",

🟢 python/quadrants/lang/_func_base.py (82%)

🔴    7  import os
     25  
🔴   26  def _kernel_coverage_enabled() -> bool:
🟢   27      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     28  
     29  
🟢  255          _kcov = None
🟢  256          if _kernel_coverage_enabled() and autodiff_mode == _qd_core.AutodiffMode.NONE:
🟢  257              from . import (  # pylint: disable=import-outside-toplevel
    258                  _kernel_coverage as _kcov,
    259              )
    260  
🟢  261              tree = _kcov.rewrite_ast(tree, function_source_info.filepath, function_source_info.start_lineno)
    262  
🟢  266          if _kcov is not None:
🟢  267              cov_field = _kcov.get_field()
🟢  268              if cov_field is not None:
🟢  269                  global_vars[_kcov.FIELD_VAR_NAME] = cov_field

🔴 python/quadrants/lang/_kernel_coverage.py (79%)

      1  """Kernel code coverage via Python AST rewriting.
      2  
      3  When enabled (QD_KERNEL_COVERAGE=1), this module rewrites kernel and func ASTs to insert coverage probes — field
      4  stores that record which source lines actually execute on the GPU. At process exit, the collected data is written
      5  to a .coverage file compatible with coverage.py / pytest-cov / diff-cover.
      6  
      7  The probes are compiled as ordinary field stores by the existing pipeline, so no C++ changes are needed. When
      8  disabled, this module is never imported and has zero impact on the normal runtime path.
      9  """
     10  
🟢   11  import ast
🟢   12  import atexit
🟢   13  import logging
🟢   14  import os
🟢   15  import threading
🟢   16  import warnings
🟢   17  from typing import TYPE_CHECKING
     18  
🟢   19  from coverage import CoverageData  # type: ignore[import-not-found]
     20  
🟢   21  import quadrants as qd
🟢   22  from quadrants.lang import impl
     23  
     24  if TYPE_CHECKING:
     25      from quadrants.lang.field import ScalarField
     26  
🟢   27  FIELD_VAR_NAME = "_qd_cov"
🟢   28  _MAX_PROBES = int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
     29  
🟢   30  _lock = threading.Lock()
🟢   31  _cov_field: "ScalarField | None" = None
🟢   32  _cov_field_prog: object | None = None  # tracks which Program instance owns _cov_field
🟢   33  _probe_counter: int = 0
     34  # {probe_id: (filepath, absolute_lineno)}
🟢   35  _probe_map: dict[int, tuple[str, int]] = {}
     36  # Accumulated coverage lines surviving across qd.init() resets
🟢   37  _accumulated_lines: dict[str, set[int]] = {}
🟢   38  _reset_hook_installed: bool = False
     39  # Directory for .coverage and _qd_kcov.* files, captured when coverage is first enabled
🟢   40  _coverage_dir: str | None = None
     41  
     42  
🟢   43  def _harvest_field() -> None:
     44      """Read probe data from the current field into _accumulated_lines.
     45  
     46      Must be called while the runtime is still alive (before clear()).
     47      """
     48      global _cov_field, _cov_field_prog
🟢   49      with _lock:
🟢   50          if _cov_field is None or not _probe_map:
🟢   51              return
🟢   52          field_ref = _cov_field
🟢   53          probe_snapshot = dict(_probe_map)
🟢   54          _cov_field = None
🟢   55          _cov_field_prog = None
🟢   56      try:
🟢   57          arr = field_ref.to_numpy()
🟢   58      except Exception:
🟢   59          logging.warning("Failed to read coverage field, coverage data for this session will be lost", exc_info=True)
🟢   60          return
🟢   61      with _lock:
🟢   62          for probe_id, (filepath, lineno) in probe_snapshot.items():
🟢   63              if probe_id < len(arr) and arr[probe_id] != 0:
🟢   64                  _accumulated_lines.setdefault(filepath, set()).add(lineno)
     65  
     66  
🟢   67  def _install_reset_hook() -> None:
     68      """Monkey-patch PyQuadrants.clear() to harvest probes before destruction."""
     69      global _reset_hook_installed
🟢   70      if _reset_hook_installed:
🟢   71          return
🟢   72      _original_clear = impl.PyQuadrants.clear
     73  
🟢   74      def _hooked_clear(self) -> None:
🟢   75          _harvest_field()
🟢   76          _original_clear(self)
     77  
🟢   78      impl.PyQuadrants.clear = _hooked_clear  # type: ignore[assignment]
🟢   79      _reset_hook_installed = True
     80  
     81  
🟢   82  def ensure_field_allocated() -> None:
     83      """Allocate (or re-allocate after qd.init()) the global coverage field."""
     84      global _cov_field, _cov_field_prog, _coverage_dir
🟢   85      _install_reset_hook()
🟢   86      if _coverage_dir is None:
🟢   87          _coverage_dir = os.getcwd()
🟢   88      current_prog = impl.get_runtime()._prog
🟢   89      if _cov_field is not None and _cov_field_prog is current_prog:
🟢   90          return
🟢   91      with _lock:
🟢   92          current_prog = impl.get_runtime()._prog
🟢   93          if _cov_field is not None and _cov_field_prog is current_prog:
🔴   94              return
🟢   95          _cov_field = qd.field(dtype=qd.i32, shape=(_MAX_PROBES,))  # type: ignore[assignment]
🟢   96          _cov_field_prog = current_prog
     97  
     98  
🟢   99  def get_field() -> "ScalarField | None":
🟢  100      with _lock:
🟢  101          if _cov_field_prog is not impl.get_runtime()._prog:
🔴  102              return None
🟢  103          return _cov_field
    104  
    105  
🟢  106  def rewrite_ast(tree: ast.Module, filepath: str, start_lineno: int) -> ast.Module:
    107      """Rewrite a kernel/func AST to insert coverage probes.
    108  
    109      Each executable statement at a new source line gets a probe: ``_qd_cov[<probe_id>] = 1``.
    110      Probes inside if/else bodies only fire when that branch is taken, giving true runtime branch coverage.
    111      """
    112      global _probe_counter
🟢  113      with _lock:
🟢  114          rewriter = _CoverageASTRewriter(
    115              field_name=FIELD_VAR_NAME,
    116              filepath=filepath,
    117              start_lineno=start_lineno,
    118              probe_id_start=_probe_counter,
    119          )
🟢  120          tree = rewriter.visit(tree)
🟢  121          ast.fix_missing_locations(tree)
🟢  122          _probe_counter = rewriter.next_probe_id
🟢  123          _probe_map.update(rewriter.probe_map)
🟢  124      return tree
    125  
    126  
🟢  127  def _detect_arc_mode() -> bool:
    128      """Detect whether pytest-cov is running in branch (arc) mode.
    129  
    130      Checks _QD_KCOV_ARC env var first (set by the pytest plugin), then falls back to reading .coverage.
    131      Defaults to False (line mode) when nothing is known, since ``pytest --cov`` without ``--cov-branch``
    132      is the more common invocation.
    133      """
🔴  134      arc_env = os.environ.get("_QD_KCOV_ARC")
🔴  135      if arc_env is not None:
🔴  136          return arc_env == "1"
🔴  137      try:
🔴  138          cov_path = os.path.join(_coverage_dir, ".coverage") if _coverage_dir else ".coverage"
🔴  139          cd = CoverageData(basename=cov_path)
🔴  140          cd.read()
🔴  141          if not cd.measured_files():
🔴  142              return False
🔴  143          return cd.has_arcs()
🔴  144      except Exception:
🔴  145          logging.debug("Failed to detect arc mode from .coverage file, defaulting to line mode", exc_info=True)
🔴  146          return False
    147  
    148  
🟢  149  def flush() -> None:
    150      """Harvest any remaining field data and write all results to a .coverage file.
    151  
    152      If .coverage.kernel already exists (e.g. from a prior test phase), the new data is merged into it so nothing
    153      is lost across multiple invocations.
    154      """
🔴  155      _harvest_field()
    156  
🔴  157      with _lock:
🔴  158          if not _accumulated_lines:
🔴  159              return
🔴  160          snapshot = {f: set(lines) for f, lines in _accumulated_lines.items()}
    161  
🔴  162      base_dir = _coverage_dir or os.getcwd()
🔴  163      kernel_path = os.path.join(base_dir, f"_qd_kcov.{os.getpid()}")
🔴  164      use_arcs = _detect_arc_mode()
    165  
🔴  166      cov = CoverageData(basename=kernel_path)
🔴  167      if use_arcs:
🔴  168          arcs_by_file: dict[str, list[tuple[int, int]]] = {}
🔴  169          for filepath, lines in snapshot.items():
    170              # Emit only entry/exit arcs per line — we know which lines ran but not the actual transitions
    171              # between them, so we avoid fabricating inter-line arcs that would misrepresent branch coverage.
🔴  172              arcs = []
🔴  173              for line in sorted(lines):
🔴  174                  arcs.append((-1, line))
🔴  175                  arcs.append((line, -1))
🔴  176              arcs_by_file[filepath] = arcs
🔴  177          cov.add_arcs(arcs_by_file)
    178      else:
🔴  179          cov.add_lines({f: sorted(lines) for f, lines in snapshot.items()})
🔴  180      cov.write()
    181  
    182  
🟢  183  _capacity_warning_emitted = False
    184  
    185  
🟢  186  class _CoverageASTRewriter(ast.NodeTransformer):
    187      """Insert coverage probes before each statement at a new source line."""
    188  
🟢  189      def __init__(self, field_name: str, filepath: str, start_lineno: int, probe_id_start: int) -> None:
🟢  190          self._field_name = field_name
🟢  191          self._filepath = filepath
🟢  192          self._start_lineno = start_lineno
🟢  193          self.next_probe_id = probe_id_start
🟢  194          self._seen_lines: set[int] = set()
🟢  195          self.probe_map: dict[int, tuple[str, int]] = {}
    196  
🟢  197      def _make_probe(self, abs_lineno: int, rel_lineno: int, col_offset: int) -> ast.Assign | None:
    198          global _capacity_warning_emitted
🟢  199          probe_id = self.next_probe_id
🟢  200          if probe_id >= _MAX_PROBES:
🟢  201              if not _capacity_warning_emitted:
🟢  202                  warnings.warn(
    203                      f"Kernel coverage probe capacity ({_MAX_PROBES}) exceeded. "
    204                      f"Additional kernel lines will not be tracked. "
    205                      f"Set QD_COVERAGE_MAX_PROBES to a higher value.",
    206                      stacklevel=2,
    207                  )
🟢  208                  _capacity_warning_emitted = True
🟢  209              return None
🟢  210          self.probe_map[probe_id] = (self._filepath, abs_lineno)
🟢  211          self.next_probe_id += 1
🟢  212          node = ast.Assign(
    213              targets=[
    214                  ast.Subscript(
    215                      value=ast.Name(id=self._field_name, ctx=ast.Load()),
    216                      slice=ast.Constant(value=probe_id),
    217                      ctx=ast.Store(),
    218                  )
    219              ],
    220              value=ast.Constant(value=1),
    221              lineno=rel_lineno,
    222              col_offset=col_offset,
    223              end_lineno=rel_lineno,
    224              end_col_offset=col_offset,
    225          )
🟢  226          return node
    227  
🟢  228      def _instrument_body(self, stmts: list[ast.stmt]) -> list[ast.stmt]:
🟢  229          result: list[ast.stmt] = []
🟢  230          for stmt in stmts:
🟢  231              rel_lineno = getattr(stmt, "lineno", None)
🟢  232              if rel_lineno is not None:
🟢  233                  abs_lineno = rel_lineno + self._start_lineno - 1
🟢  234                  if abs_lineno not in self._seen_lines:
🟢  235                      self._seen_lines.add(abs_lineno)
🟢  236                      col = getattr(stmt, "col_offset", 0)
🟢  237                      probe = self._make_probe(abs_lineno, rel_lineno, col)
🟢  238                      if probe is not None:
🟢  239                          result.append(probe)
🟢  240              result.append(self.visit(stmt))
🟢  241          return result
    242  
🟢  243      def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
🟢  244          node.body = self._instrument_body(node.body)
🟢  245          return node
    246  
🟢  247      def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> ast.AsyncFunctionDef:
🔴  248          node.body = self._instrument_body(node.body)
🔴  249          return node
    250  
🟢  251      def visit_If(self, node: ast.If) -> ast.If:
🟢  252          node.body = self._instrument_body(node.body)
🟢  253          if node.orelse:
🟢  254              node.orelse = self._instrument_body(node.orelse)
🟢  255          return node
    256  
🟢  257      def visit_For(self, node: ast.For) -> ast.For:
🟢  258          node.body = self._instrument_body(node.body)
🟢  259          if node.orelse:
🟢  260              node.orelse = self._instrument_body(node.orelse)
🟢  261          return node
    262  
🟢  263      def visit_While(self, node: ast.While) -> ast.While:
🟢  264          node.body = self._instrument_body(node.body)
🟢  265          if node.orelse:
🟢  266              node.orelse = self._instrument_body(node.orelse)
🟢  267          return node
    268  
🟢  269      def visit_With(self, node: ast.With) -> ast.With:
🟢  270          node.body = self._instrument_body(node.body)
🟢  271          return node
    272  
🟢  273      def visit_Try(self, node: ast.Try) -> ast.Try:
🟢  274          node.body = self._instrument_body(node.body)
🟢  275          for handler in node.handlers:
🟢  276              handler.body = self._instrument_body(handler.body)
🟢  277          if node.orelse:
🟢  278              node.orelse = self._instrument_body(node.orelse)
🟢  279          if node.finalbody:
🟢  280              node.finalbody = self._instrument_body(node.finalbody)
🟢  281          return node
    282  
    283  
🟢  284  atexit.register(flush)
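
The rewriter listed above can be illustrated with a much smaller, standalone sketch. It is not the PR's implementation: `ProbeInserter`, the `_cov` dict, and the sample function `f` are hypothetical names, probes write into a plain Python dict rather than a device field, and only `FunctionDef` and `If` are handled. It assumes Python 3.9 or later, where a subscript slice can be an `ast.Constant` directly.

```python
import ast
import textwrap


class ProbeInserter(ast.NodeTransformer):
    """Hypothetical, simplified stand-in for the coverage rewriter."""

    def __init__(self, field_name, start_lineno=1):
        self.field_name = field_name
        self.start_lineno = start_lineno
        self.probe_map = {}  # probe id -> absolute source line
        self._seen = set()   # absolute lines that already have a probe

    def _probe(self, probe_id):
        # Builds the statement `<field_name>[probe_id] = 1`.
        return ast.Assign(
            targets=[
                ast.Subscript(
                    value=ast.Name(id=self.field_name, ctx=ast.Load()),
                    slice=ast.Constant(value=probe_id),
                    ctx=ast.Store(),
                )
            ],
            value=ast.Constant(value=1),
        )

    def _instrument(self, stmts):
        out = []
        for stmt in stmts:
            abs_lineno = stmt.lineno + self.start_lineno - 1
            if abs_lineno not in self._seen:  # one probe per source line
                self._seen.add(abs_lineno)
                probe_id = len(self.probe_map)
                self.probe_map[probe_id] = abs_lineno
                out.append(self._probe(probe_id))
            out.append(self.visit(stmt))
        return out

    def visit_FunctionDef(self, node):
        node.body = self._instrument(node.body)
        return node

    def visit_If(self, node):
        node.body = self._instrument(node.body)
        node.orelse = self._instrument(node.orelse)
        return node


src = textwrap.dedent(
    """\
    def f(x):
        if x > 5:
            y = 1
        else:
            y = 2
        return y
    """
)
rewriter = ProbeInserter("_cov")
tree = ast.fix_missing_locations(rewriter.visit(ast.parse(src)))

hits = {}  # stands in for the coverage field
namespace = {"_cov": hits}
exec(compile(tree, "<sketch>", "exec"), namespace)
namespace["f"](10)  # takes the if-branch only

fired_lines = sorted(rewriter.probe_map[pid] for pid in hits)
missed_lines = sorted(set(rewriter.probe_map.values()) - set(fired_lines))
```

Because a probe sits at the top of each branch body, the line in the untaken `else` branch ends up in `missed_lines`. This is the same effect the end-to-end branch tests check on a real device.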

🟢 python/quadrants/lang/ast/ast_transformer_utils.py (100%)

🟢  335              if not name.startswith("_qd_"):
🟢  336                  reason = f"{name} is in global vars, therefore violates pure"
🟢  337                  violates_pure = True

🟢 python/quadrants/lang/kernel.py (80%)

     19  
🔴   20  def _kernel_coverage_enabled() -> bool:
🟢   21      return os.environ.get("QD_KERNEL_COVERAGE") == "1"
     22  
     23  
🟢  382          if _kernel_coverage_enabled():
🟢  383              from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    384  
🟢  385              _kernel_coverage.ensure_field_allocated()
    386

🟢 python/quadrants/lang/misc.py (100%)

    496  
🟢  497      if os.environ.get("QD_KERNEL_COVERAGE") == "1":
🟢  498          from . import _kernel_coverage  # pylint: disable=import-outside-toplevel
    499  
🟢  500          _kernel_coverage.ensure_field_allocated()
    501
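
Both call sites above guard the import behind the environment variable, so the coverage module is never loaded when the feature is off. A minimal sketch of that gating pattern follows. `maybe_load` is a hypothetical helper, and the stdlib module `colorsys` stands in for `_kernel_coverage` so the snippet runs without quadrants installed.

```python
import importlib
import os


def maybe_load(module_name, env=os.environ):
    """Import module_name only when QD_KERNEL_COVERAGE is exactly "1"."""
    if env.get("QD_KERNEL_COVERAGE") != "1":
        return None  # disabled: no import, so no import-time cost
    return importlib.import_module(module_name)


disabled = maybe_load("colorsys", env={})
enabled = maybe_load("colorsys", env={"QD_KERNEL_COVERAGE": "1"})
```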

🔴 python/quadrants/pytest_plugin.py (67%)

      1  """Pytest plugin that auto-enables kernel coverage when pytest-cov is active.
      2  
      3  Registered via the ``pytest11`` entry point so it loads automatically when quadrants is installed.
      4  Opt out by setting ``QD_KERNEL_COVERAGE=0`` explicitly.
      5  """
      6  
🔴    7  import os
      8  
      9  
🔴   10  def pytest_configure(config):
🟢   11      if not config.pluginmanager.hasplugin("_cov"):
🔴   12          return
🟢   13      os.environ.setdefault("QD_KERNEL_COVERAGE", "1")
🟢   14      if os.environ.get("QD_KERNEL_COVERAGE") != "1":
🟢   15          return
     16      # Tell the kernel coverage module whether pytest-cov is running in branch (arc) mode,
     17      # so it writes the matching format and avoids "Can not mix line and arc data" at combine time.
     18      # We read config.option.cov_branch which pytest-cov has already populated by this point.
🟢   19      cov_branch = getattr(config.option, "cov_branch", False) or False
🟢   20      os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
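
The opt-out behaviour of `pytest_configure` comes from `setdefault`, which only writes the variable when the user has not set it. The sketch below restates that logic as a pure function so it can be exercised without pytest. `configure_env` and its arguments are hypothetical; `cov_active` stands in for the `hasplugin("_cov")` check and `cov_branch` for `config.option.cov_branch`.

```python
def configure_env(env, cov_active, cov_branch):
    """Return a copy of env updated the way the plugin above updates os.environ."""
    env = dict(env)
    if not cov_active:
        return env  # pytest-cov not loaded: leave everything alone
    env.setdefault("QD_KERNEL_COVERAGE", "1")  # keeps an explicit user value such as "0"
    if env["QD_KERNEL_COVERAGE"] != "1":
        return env  # user opted out
    env["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
    return env


auto_enabled = configure_env({}, cov_active=True, cov_branch=True)
opted_out = configure_env({"QD_KERNEL_COVERAGE": "0"}, cov_active=True, cov_branch=True)
no_cov = configure_env({}, cov_active=False, cov_branch=None)
```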

🟢 tests/python/quadrants/lang/fast_caching/test_src_ll_cache.py (100%)

     11  
🟢   12  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
🟢   67  @pytest.mark.skipif(
     68      _KERNEL_COVERAGE,
     69      reason="Coverage probes change LLVM IR addresses after reinit, breaking recompile comparison",
     70  )

🟢 tests/python/quadrants/lang/test_kernel_impl.py (100%)

🟢    1  import os
🟢   11  _KERNEL_COVERAGE = os.environ.get("QD_KERNEL_COVERAGE") == "1"
     13  
🟢   14  @pytest.mark.skipif(
     15      _KERNEL_COVERAGE,
     16      reason="Coverage probes change the kernel AST, preventing FE-LL cache hits after reinit",
     17  )

🟢 tests/python/test_api.py (100%)

🟢  438      actual = sorted([s for s in dir(src) if not s.startswith(("_", "@")) and s != "pytest_plugin"])

🟢 tests/python/test_kernel_coverage.py (99%)

      1  """Tests for kernel code coverage instrumentation.
      2  
      3  These tests verify that the AST rewriter correctly inserts coverage probes and that the probes fire when kernel
      4  code executes on the device.
      5  """
      6  
🟢    7  import ast
🟢    8  import os
🟢    9  import textwrap
     10  
🟢   11  import pytest
     12  
🟢   13  import quadrants as qd
     14  
🟢   15  from tests import test_utils
     16  
     17  # These tests only run when QD_KERNEL_COVERAGE=1
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE", "") != "1",
     20      reason="QD_KERNEL_COVERAGE=1 not set",
     21  )
     22  
     23  
     24  # ---------------------------------------------------------------------------
     25  # AST rewriter unit tests
     26  # ---------------------------------------------------------------------------
     27  
🟢   28  _AST_REWRITER_CASES = [
     29      pytest.param(
     30          """\
     31          def f():
     32              x = 1
     33              y = 2
     34              return x + y
     35          """,
     36          {11, 12, 13},
     37          10,
     38          id="straight_line",
     39      ),
     40      pytest.param(
     41          """\
     42          def f():
     43              if x > 0:
     44                  a = 1
     45              else:
     46                  b = 2
     47          """,
     48          {2, 3, 5},
     49          1,
     50          id="if_else",
     51      ),
     52      pytest.param(
     53          """\
     54          def f():
     55              for i in range(10):
     56                  x = i
     57          """,
     58          {2, 3},
     59          1,
     60          id="for_loop",
     61      ),
     62      pytest.param(
     63          """\
     64          def f():
     65              while x > 0:
     66                  x = x - 1
     67              else:
     68                  y = 0
     69          """,
     70          {2, 3, 5},
     71          1,
     72          id="while_loop_else",
     73      ),
     74      pytest.param(
     75          """\
     76          def f():
     77              with ctx:
     78                  a = 1
     79                  b = 2
     80          """,
     81          {2, 3, 4},
     82          1,
     83          id="with_statement",
     84      ),
     85      pytest.param(
     86          """\
     87          def f():
     88              try:
     89                  a = 1
     90              except:
     91                  b = 2
     92              else:
     93                  c = 3
     94              finally:
     95                  d = 4
     96          """,
     97          {3, 5, 7, 9},
     98          1,
     99          id="try_except_finally",
    100      ),
    101  ]
    102  
    103  
🟢  104  @pytest.mark.parametrize("src,expected_lines,start_lineno", _AST_REWRITER_CASES)
🟢  105  def test_ast_rewriter(src, expected_lines, start_lineno):
    106      """Verify the AST rewriter inserts probes at the expected source lines."""
🟢  107      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    108  
🟢  109      tree = ast.parse(textwrap.dedent(src))
🟢  110      rewriter = _CoverageASTRewriter(
    111          field_name="_qd_cov", filepath="test.py", start_lineno=start_lineno, probe_id_start=0
    112      )
🟢  113      rewriter.visit(tree)
    114  
🟢  115      covered_lines = {lineno for _, (_, lineno) in rewriter.probe_map.items()}
🟢  116      assert expected_lines.issubset(covered_lines), f"Expected lines {expected_lines} to be probed, got {covered_lines}"
    117  
    118  
🟢  119  def test_ast_rewriter_capacity_limit():
    120      """Verify that probes stop being inserted when the capacity limit is hit."""
🟢  121      import warnings
    122  
🟢  123      import quadrants.lang._kernel_coverage as kcov
🟢  124      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    125  
🟢  126      src = textwrap.dedent(
    127          """\
    128          def f():
    129              a = 1
    130              b = 2
    131              c = 3
    132      """
    133      )
🟢  134      tree = ast.parse(src)
🟢  135      old_warning_state = kcov._capacity_warning_emitted
🟢  136      kcov._capacity_warning_emitted = False
🟢  137      try:
🟢  138          with warnings.catch_warnings(record=True) as w:
🟢  139              warnings.simplefilter("always")
🟢  140              rewriter = _CoverageASTRewriter(
    141                  field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=kcov._MAX_PROBES - 1
    142              )
🟢  143              rewriter.visit(tree)
    144  
🟢  145          assert rewriter.next_probe_id == kcov._MAX_PROBES
🟢  146          assert len(rewriter.probe_map) == 1, f"Only 1 probe should fit, got {len(rewriter.probe_map)}"
🟢  147          assert len(w) == 1
🟢  148          assert "exceeded" in str(w[0].message).lower()
    149      finally:
🟢  150          kcov._capacity_warning_emitted = old_warning_state
    151  
    152  
🟢  153  def test_ast_rewriter_deduplicates_same_line():
    154      """Verify that two statements on the same source line get only one probe."""
🟢  155      from quadrants.lang._kernel_coverage import _CoverageASTRewriter
    156  
🟢  157      src = "def f():\n    a = 1; b = 2\n"
🟢  158      tree = ast.parse(src)
🟢  159      rewriter = _CoverageASTRewriter(field_name="_qd_cov", filepath="test.py", start_lineno=1, probe_id_start=0)
🟢  160      rewriter.visit(tree)
    161  
🟢  162      abs_lines = [lineno for _, (_, lineno) in rewriter.probe_map.items()]
🟢  163      assert abs_lines.count(2) == 1, f"Line 2 should have exactly one probe, got {abs_lines.count(2)}"
    164  
    165  
🟢  166  def test_env_var_max_probes():
    167      """Verify that QD_COVERAGE_MAX_PROBES env var is read at import time."""
🟢  168      import quadrants.lang._kernel_coverage as kcov
    169  
🟢  170      assert kcov._MAX_PROBES == int(os.environ.get("QD_COVERAGE_MAX_PROBES", "100000"))
    171  
    172  
🟢  173  def test_harvest_field_exception_path():
    174      """Verify that _harvest_field handles to_numpy() failure gracefully."""
🟢  175      from unittest.mock import MagicMock
    176  
🟢  177      import quadrants.lang._kernel_coverage as kcov
    178  
🟢  179      old_field = kcov._cov_field
🟢  180      old_prog = kcov._cov_field_prog
🟢  181      old_map = kcov._probe_map.copy()
🟢  182      try:
🟢  183          mock_field = MagicMock()
🟢  184          mock_field.to_numpy.side_effect = RuntimeError("runtime destroyed")
🟢  185          kcov._cov_field = mock_field
🟢  186          kcov._cov_field_prog = object()
🟢  187          kcov._probe_map[999999] = ("fake.py", 1)
    188  
    189          # Should not raise — the exception is caught and logged
🟢  190          kcov._harvest_field()
    191  
🟢  192          assert kcov._cov_field is None, "Field should be cleared after failure"
🟢  193          assert kcov._cov_field_prog is None, "Field prog should be cleared after failure"
    194      finally:
🟢  195          kcov._cov_field = old_field
🟢  196          kcov._cov_field_prog = old_prog
🟢  197          kcov._probe_map = old_map
    198  
    199  
    200  # ---------------------------------------------------------------------------
    201  # End-to-end tests
    202  # ---------------------------------------------------------------------------
    203  
    204  
🟢  205  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  206  def test_kernel_coverage_branches_e2e():
    207      """Verify that only the taken branch has its probe fired."""
🟢  208      from quadrants.lang import _kernel_coverage
    209  
🟢  210      _kernel_coverage.ensure_field_allocated()
    211  
🟢  212      probe_count_before = _kernel_coverage._probe_counter
🟢  213      out = qd.field(dtype=qd.i32, shape=(1,))
    214  
🟢  215      @qd.kernel
🟢  216      def branching_kernel():
🟢  217          x = 10
🟢  218          if x > 5:
🟢  219              out[0] = 1
    220          else:
🔴  221              out[0] = 2
    222  
🟢  223      branching_kernel()
    224  
🟢  225      assert out[0] == 1
    226  
🟢  227      cov_field = _kernel_coverage.get_field()
🟢  228      arr = cov_field.to_numpy()
    229  
🟢  230      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    231  
🟢  232      taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] != 0}
🟢  233      not_taken_probes = {pid for pid, loc in probes_for_kernel.items() if arr[pid] == 0}
    234  
🟢  235      assert len(taken_probes) > 0, "At least some probes should have fired"
🟢  236      assert len(not_taken_probes) > 0, "The else branch should not have been reached"
    237  
    238  
🟢  239  @test_utils.test(arch=qd.gpu)
🟢  240  def test_kernel_coverage_simt_e2e():
    241      """Verify coverage probes track branches with block.sync() and subgroup shuffle.
    242  
    243      The if/else is based on a runtime value read from a field, so the compiler cannot constant-fold it away.
    244      Only the taken branch's shuffle probe should fire.
    245      """
🟢  246      from quadrants.lang import _kernel_coverage
🟢  247      from quadrants.lang.simt import subgroup
    248  
🟢  249      _kernel_coverage.ensure_field_allocated()
    250  
🟢  251      N = 64
🟢  252      probe_count_before = _kernel_coverage._probe_counter
🟢  253      flag = qd.field(dtype=qd.i32, shape=(1,))
🟢  254      a = qd.field(dtype=qd.i32, shape=(N,))
🟢  255      out = qd.field(dtype=qd.i32, shape=(N,))
    256  
🟢  257      flag[0] = 1  # runtime value: take the if-branch
    258  
🟢  259      @qd.kernel
🟢  260      def simt_kernel():
🟢  261          qd.loop_config(block_dim=N)
🟢  262          for i in range(N):
🟢  263              a[i] = i + 1
🟢  264              qd.simt.block.sync()
🟢  265              if flag[0] > 0:
🟢  266                  val = subgroup.shuffle(a[i], qd.u32(0))
🟢  267                  out[i] = val
    268              else:
🔴  269                  val = subgroup.shuffle(a[i], qd.u32(1))
🔴  270                  out[i] = val + 100
    271  
🟢  272      simt_kernel()
    273  
🟢  274      for i in range(4):
🟢  275          assert out[i] == 1, f"Expected 1 at index {i}, got {out[i]}"
    276  
🟢  277      cov_field = _kernel_coverage.get_field()
🟢  278      arr = cov_field.to_numpy()
    279  
🟢  280      probes_for_kernel = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
    281  
🟢  282      fired = {pid for pid in probes_for_kernel if arr[pid] != 0}
🟢  283      not_fired = {pid for pid in probes_for_kernel if arr[pid] == 0}
🟢  284      assert len(fired) >= 4, f"Expected at least 4 probes to fire, got {len(fired)}"
🟢  285      assert len(not_fired) >= 2, "The else branch should not have been reached"
    286  
    287  
🟢  288  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  289  def test_kernel_coverage_survives_reinit():
    290      """Verify that coverage data accumulated before qd.init() reset is preserved.
    291  
    292      Runs a kernel, then resets via qd.reset()/qd.init() (which triggers the _hooked_clear harvest), runs another
    293      kernel, harvests again, and checks that _accumulated_lines contains data from both sessions.
    294      """
🟢  295      from quadrants.lang import _kernel_coverage, impl
    296  
🟢  297      current_arch = impl.get_runtime()._arch
🟢  298      _kernel_coverage.ensure_field_allocated()
    299  
🟢  300      probe_count_before = _kernel_coverage._probe_counter
🟢  301      out1 = qd.field(dtype=qd.i32, shape=(1,))
    302  
🟢  303      @qd.kernel
🟢  304      def kernel_before_reset():
🟢  305          out1[0] = 1
    306  
🟢  307      kernel_before_reset()
    308  
🟢  309      cov_field = _kernel_coverage.get_field()
🟢  310      assert cov_field is not None
🟢  311      arr = cov_field.to_numpy()
🟢  312      probes_first = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  313      fired_first = {pid for pid in probes_first if arr[pid] != 0}
🟢  314      assert len(fired_first) > 0, "Probes from first kernel should have fired"
    315  
    316      # Don't call _harvest_field() manually — let qd.reset() trigger it via the _hooked_clear hook
🟢  317      qd.reset()
    318  
    319      # Verify the hook harvested data from the first session
🟢  320      files_before = set(_kernel_coverage._accumulated_lines.keys())
🟢  321      assert len(files_before) > 0, "Hook should have harvested data during reset"
🟢  322      lines_before = {}
🟢  323      for f, lines in _kernel_coverage._accumulated_lines.items():
🟢  324          lines_before[f] = set(lines)
    325  
🟢  326      qd.init(arch=current_arch)
    327  
🟢  328      _kernel_coverage.ensure_field_allocated()
    329  
🟢  330      probe_count_mid = _kernel_coverage._probe_counter
🟢  331      out2 = qd.field(dtype=qd.i32, shape=(1,))
    332  
🟢  333      @qd.kernel
🟢  334      def kernel_after_reset():
🟢  335          out2[0] = 2
    336  
🟢  337      kernel_after_reset()
    338  
🟢  339      _kernel_coverage._harvest_field()
    340  
🟢  341      for f in files_before:
🟢  342          assert (
    343              f in _kernel_coverage._accumulated_lines
    344          ), f"File {f} from before reset should still be in _accumulated_lines"
🟢  345          assert lines_before[f].issubset(
    346              _kernel_coverage._accumulated_lines[f]
    347          ), "Lines from before reset should be preserved"
    348  
🟢  349      probes_second = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_mid}
🟢  350      second_files = {loc[0] for loc in probes_second.values()}
🟢  351      for f in second_files:
🟢  352          assert f in _kernel_coverage._accumulated_lines, f"File {f} from second kernel should be in _accumulated_lines"
    353  
    354  
🟢  355  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  356  def test_kernel_coverage_autodiff():
    357      """Verify that autodiff forward pass produces probes but backward does not.
    358  
    359      The forward compilation (AutodiffMode.NONE) should insert probes that fire. The backward compilation
    360      (AutodiffMode.REVERSE) should not add any probes.
    361      """
🟢  362      from quadrants.lang import _kernel_coverage
    363  
🟢  364      _kernel_coverage.ensure_field_allocated()
    365  
🟢  366      x = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
🟢  367      loss = qd.field(dtype=qd.f32, shape=(), needs_grad=True)
    368  
🟢  369      @qd.kernel
🟢  370      def compute():
🟢  371          loss[None] = x[None] * x[None]
    372  
🟢  373      x[None] = 5.0
    374  
🟢  375      probe_count_before = _kernel_coverage._probe_counter
    376  
🟢  377      with qd.ad.Tape(loss):
🟢  378          compute()
    379  
🟢  380      probe_count_after_tape = _kernel_coverage._probe_counter
🟢  381      forward_probes = probe_count_after_tape - probe_count_before
🟢  382      assert forward_probes > 0, "Forward compilation should have inserted probes"
    383  
    384      # Verify forward probes actually fired
🟢  385      cov_field = _kernel_coverage.get_field()
🟢  386      assert cov_field is not None
🟢  387      arr = cov_field.to_numpy()
🟢  388      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  389      fired = {pid for pid in probes if arr[pid] != 0}
🟢  390      assert len(fired) > 0, "Forward pass inside Tape should produce fired coverage probes"
    391  
    392      # Verify backward pass computes correct gradients
🟢  393      assert loss[None] == pytest.approx(25.0)
🟢  394      assert x.grad[None] == pytest.approx(10.0)
    395  
    396  
🟢  397  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  398  def test_kernel_coverage_qd_func():
    399      """Verify that probes fire inside a @qd.func called from a kernel."""
🟢  400      from quadrants.lang import _kernel_coverage
    401  
🟢  402      _kernel_coverage.ensure_field_allocated()
    403  
🟢  404      probe_count_before = _kernel_coverage._probe_counter
🟢  405      out = qd.field(dtype=qd.i32, shape=(1,))
    406  
🟢  407      @qd.func
🟢  408      def helper():
🟢  409          out[0] = 99
    410  
🟢  411      @qd.kernel
🟢  412      def caller():
🟢  413          helper()
    414  
🟢  415      caller()
    416  
🟢  417      assert out[0] == 99
    418  
🟢  419      cov_field = _kernel_coverage.get_field()
🟢  420      assert cov_field is not None
🟢  421      arr = cov_field.to_numpy()
    422  
🟢  423      probes = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_before}
🟢  424      fired = {pid for pid in probes if arr[pid] != 0}
    425      # The kernel body has one statement (helper()), and the func body has one (out[0] = 99).
    426      # Both should produce probes that fire.
🟢  427      assert (
    428          len(fired) >= 2
    429      ), f"Expected probes from both kernel and func to fire, got {len(fired)} fired out of {len(probes)}"
    430  
    431  
🟢  432  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  433  def test_kernel_coverage_multiple_kernels_same_session():
    434      """Verify that probes from two different kernels both fire in the same session."""
🟢  435      from quadrants.lang import _kernel_coverage
    436  
🟢  437      _kernel_coverage.ensure_field_allocated()
    438  
🟢  439      probe_count_before = _kernel_coverage._probe_counter
🟢  440      a = qd.field(dtype=qd.i32, shape=(1,))
🟢  441      b = qd.field(dtype=qd.i32, shape=(1,))
    442  
🟢  443      @qd.kernel
🟢  444      def kernel_a():
🟢  445          a[0] = 10
    446  
🟢  447      @qd.kernel
🟢  448      def kernel_b():
🟢  449          b[0] = 20
    450  
🟢  451      kernel_a()
🟢  452      probe_count_after_a = _kernel_coverage._probe_counter
🟢  453      kernel_b()
    454  
🟢  455      assert a[0] == 10
🟢  456      assert b[0] == 20
    457  
🟢  458      cov_field = _kernel_coverage.get_field()
🟢  459      arr = cov_field.to_numpy()
    460  
🟢  461      probes_a = {
    462          pid: loc for pid, loc in _kernel_coverage._probe_map.items() if probe_count_before <= pid < probe_count_after_a
    463      }
🟢  464      probes_b = {pid: loc for pid, loc in _kernel_coverage._probe_map.items() if pid >= probe_count_after_a}
    465  
🟢  466      fired_a = {pid for pid in probes_a if arr[pid] != 0}
🟢  467      fired_b = {pid for pid in probes_b if arr[pid] != 0}
    468  
🟢  469      assert len(fired_a) > 0, "Probes from kernel_a should have fired"
🟢  470      assert len(fired_b) > 0, "Probes from kernel_b should have fired"
    471  
    472  
🟢  473  @test_utils.test(arch=[qd.cpu, qd.cuda])
🟢  474  def test_qd_prefix_exemption_pure_kernel():
    475      """Verify that _qd_-prefixed globals don't violate pure kernel checks.
    476  
    477      With kernel coverage enabled, _qd_cov is injected as a global. This test verifies that a pure (fastcache)
    478      kernel still compiles without error. The kernel uses ndarray arguments (not global fields) because pure
    479      kernels prohibit non-_qd_ globals.
    480      """
🟢  481      a = qd.ndarray(qd.i32, (1,))
    482  
🟢  483      @qd.kernel(fastcache=True)
🟢  484      def pure_kernel(arr: qd.types.NDArray) -> None:
🟢  485          arr[0] = 42
    486  
🟢  487      pure_kernel(a)
🟢  488      assert a[0] == 42
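
`test_kernel_coverage_survives_reinit` relies on hit data being folded into a host-side accumulator before the runtime, and with it the coverage field, is destroyed. The body of `_harvest_field` is not shown in this report, so the following is only a sketch of the idea under assumed names: `probe_map`, `accumulated_lines`, and `harvest` are hypothetical, and a Python list stands in for the device buffer.

```python
# Hypothetical host-side state that outlives any single runtime session.
probe_map = {}          # probe id -> (filepath, lineno)
accumulated_lines = {}  # filepath -> set of line numbers seen so far


def harvest(hits):
    """Fold fired probes into accumulated_lines, then zero the hit buffer."""
    for probe_id, (filepath, lineno) in probe_map.items():
        if probe_id < len(hits) and hits[probe_id] != 0:
            accumulated_lines.setdefault(filepath, set()).add(lineno)
    for i in range(len(hits)):
        hits[i] = 0


# Session 1: a kernel on line 10 of a.py runs, then the runtime is reset.
probe_map[0] = ("a.py", 10)
session1 = [1, 0]
harvest(session1)  # must run before the buffer goes away

# Session 2: a fresh buffer, and a kernel on line 20 of b.py runs.
probe_map[1] = ("b.py", 20)
session2 = [0, 1]
harvest(session2)
```

After both harvests the accumulator holds lines from both sessions, which is the property the test asserts on `_accumulated_lines`.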

🟢 tests/python/test_offline_cache.py (100%)

     16  # Coverage field allocation creates internal fill kernels that change cache file counts.
     17  # CI runs these tests in a separate phase without QD_KERNEL_COVERAGE (see 4_test.sh).
🟢   18  pytestmark = pytest.mark.skipif(
     19      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     20      reason="Kernel coverage adds internal kernels that invalidate cache file count assertions",
     21  )
     22

🟢 tests/python/test_snode_layout_inspection.py (100%)

🟢    1  import os
      2  
🟢    3  import pytest
      4  
🟢   10  @pytest.mark.skipif(
     11      os.environ.get("QD_KERNEL_COVERAGE") == "1",
     12      reason="Kernel coverage field on root shifts offset assertions",
     13  )

github-actions · 2026-04-25T21:04:11Z

Coverage Report (`2b7d0561d`)

Metric	Value
Diff coverage (changed lines only)	0%
Overall project coverage	73%

Total: 0 lines, 0 missing, 0% covered

* [Misc] Warn user to disable caching when print_ir/QD_DUMP_IR enabled (Genesis-Embodied-AI#425) Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>
* [Build] Pin torch version to CUDA 12.8 for CUDA tests (Genesis-Embodied-AI#428)
* [Misc] Fixing up taichi-dev urls (Genesis-Embodied-AI#429)
* [Perf] Rename cuda_graph to gpu_graph across the codebase (Genesis-Embodied-AI#430)
* Misc: fix typo integeral -> integral (Genesis-Embodied-AI#434) Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>
* [Perf] CUDA graph 4: call from multiple locations (Genesis-Embodied-AI#420)
* [Bug] Fix fastcache not restoring graph_do_while_arg (Genesis-Embodied-AI#435)
* [Perf] Cache last-call result in perf_dispatch for single-compatible case (Genesis-Embodied-AI#438)
* Fix gpu_graph fallback on old Nvidia GPU. (Genesis-Embodied-AI#443)
* Fix shared memory offset not reset between CUDA kernels. (Genesis-Embodied-AI#442)
* [Misc] Allow disabling GPU graph via QD_GPU_GRAPH=0 env var (Genesis-Embodied-AI#439)
* [Misc] Add named top-level loops (Genesis-Embodied-AI#440)
* [Misc] Rename gpu_graph to graph (Genesis-Embodied-AI#446)
* [Misc] Add cross-platform shuffle (Genesis-Embodied-AI#447)
* [Bug] Fix graph_do_while on Windows: search for cudadevrt.lib (Genesis-Embodied-AI#456)
* [Bug] Also search default CUDA toolkit install location on Windows (Genesis-Embodied-AI#461)
* [SPIRV] Feature Parity Atomics & Shared Array (Genesis-Embodied-AI#432)
* [Misc] Change clang format to 120 characters (Genesis-Embodied-AI#463)
* [Misc] CUDA graph 5 Add fatbin (Genesis-Embodied-AI#464)
* [Bug] Reuse VkInstance across init/reset cycles (Genesis-Embodied-AI#465)
* [Perf] Tiles 1: _load, _store, _eye_ (Genesis-Embodied-AI#466)
* [Misc] Remove dead InternalFuncStmt type_check override (Genesis-Embodied-AI#471)
* [Perf] Tiles 2: add cholesky and ger (Genesis-Embodied-AI#472)
* [Perf] Tiles 2b: add triangular solve (Genesis-Embodied-AI#474)
* [Misc] Refactor: use _get_col/_set_col in tiles load/store/init (Genesis-Embodied-AI#475)
* [Build] Fix flaky test_clock_accuracy (Genesis-Embodied-AI#436)
* Fix AARCH64 emitting invalid asm in CUDA kernels. (Genesis-Embodied-AI#473) Co-authored-by: Hugh Perkins <hughperkins@gmail.com>
* [AMDGPU] Enable HIP memory pool and surface pool-exhaustion errors. (Genesis-Embodied-AI#485)
* [AMDGPU] Scope hsaco tmp dir per-user to avoid collisions. (Genesis-Embodied-AI#484)
* [Perf] Tiles 3: Add slice syntax, qd.outer() and initial doc (Genesis-Embodied-AI#477)
* [AMDGPU] Fix gradient computation. (Genesis-Embodied-AI#486)
* Enable all backends that are supported in unit tests. (Genesis-Embodied-AI#488)
* Fix SPIRV ID overflow for large kernels due to autodiff. (Genesis-Embodied-AI#489)
* [Misc] Fix purity checker to allow accessing constants from quadrants modules (Genesis-Embodied-AI#487)
* [Misc] Increase tolerance for clock monotonic test (Genesis-Embodied-AI#492)
* [CI] Serialize api doc workflow (Genesis-Embodied-AI#494)
* [CI] Increase tolerance for clock test (Genesis-Embodied-AI#506)
* [CI] Increase clock test tolerance to 20% (Genesis-Embodied-AI#509)
* [Perf] Add tensor_type parametrization to tile16 tests (Genesis-Embodied-AI#504)
* [Perf] Tiles 4b: Migrate tiles16 tests to enable fastcache (Genesis-Embodied-AI#505)
* [Perf] Tiles 4c: add Tiles16x16 proxy (Genesis-Embodied-AI#507)
* [Perf] Tiles 4d: Consolidate slice error tests using parametrize (Genesis-Embodied-AI#508)
* [Perf] Tiles 4: add SharedArray slice support (Genesis-Embodied-AI#482)
* [Perf] Tiles 5: add Cholesky benchmark demo (Genesis-Embodied-AI#483)
* [Doc] Add user guide page for subgroup shuffle (Genesis-Embodied-AI#512)
* [Perf] Implement cross-platform shuffle_down (Genesis-Embodied-AI#510)
* [Perf] Add portable subgroup reduce_add and reduce_all_add (Genesis-Embodied-AI#511)
* [Perf] Add first warmup config to perf dispatch (Genesis-Embodied-AI#422)
* [AutoDiff] Autodiff 1: Add baseline adstack regression test for unary_collections (Genesis-Embodied-AI#500)
* [AutoDiff] Autodiff 2: Implement derivative for tan (Genesis-Embodied-AI#501)
* [AutoDiff] Autodiff 3: Recompute tanh/exp on the operand in the reverse pass (Genesis-Embodied-AI#502)
* [AutoDiff] Autodiff 4: Mark rsqrt as non-linear for adstack promotion (Genesis-Embodied-AI#503)
* [AutoDiff] Autodiff 5: Fix adjoint-alloca placement for GlobalLoads outside the current range-for (Genesis-Embodied-AI#496)
* [AutoDiff] Autodiff 6: Adstack regression tests (Genesis-Embodied-AI#491)
* [AutoDiff] Autodiff 7: Fix header size in AdStackAllocaStmt to match u64 runtime layout (Genesis-Embodied-AI#534)
* [AutoDiff] Autodiff 8: Surface LLVM adstack push/pop overflow as a Python exception (Genesis-Embodied-AI#535)
* [AutoDiff] Autodiff 9: Guard against LLVM worker-thread stack overflow from large per-task adstack budget (Genesis-Embodied-AI#495)
* [AutoDiff] Autodiff 10: Implement adstack for SPIR-V (Genesis-Embodied-AI#490)
* [AutoDiff] Autodiff 11: Latent adstack-adjacent fixes (AMDGPU hipFree, flush() keeps ctx_buffers_, always-preallocate) (Genesis-Embodied-AI#536)
* [Doc] Add AGENTS.md with instructions for AI agents (Genesis-Embodied-AI#541)
* [Bug] Abort kernel execution on assertion failure instead of segfaulting (Genesis-Embodied-AI#419)
* [Type] ndarray typing 1: Add eval_str=True to inspect.signature() calls (Genesis-Embodied-AI#411)
* [CI] Suppress reportPrivateImportUsage in torch-using files (Genesis-Embodied-AI#552)
* [Misc] QD_DUMP_IR dumps to files with the task_id added to the filename (Genesis-Embodied-AI#441)
* [Type] ndarray typing 2: Fix NDArray single-arg subscript crash (Genesis-Embodied-AI#412)
* [Test] Flush xdist channel before worker exit so test failure reports are visible (Genesis-Embodied-AI#555)
* [CI] Reduce test retries on CI from 3 to 1. (Genesis-Embodied-AI#554)
* [AutoDiff] Autodiff 12: Heap-backed adstack on LLVM backends (CPU/CUDA/AMDGPU) (Genesis-Embodied-AI#537)
* [AutoDiff] Autodiff 13: Heap-backed adstack on SPIR-V backends (Metal, Vulkan) (Genesis-Embodied-AI#493)
* [AutoDiff] Autodiff 14: Resolve bounded-inner-loop adstacks without default_ad_stack_size fallback (Genesis-Embodied-AI#539)
* [SPIRV] Vulkan SPIR-V correctness: atomic-view aliasing, PSB stride, narrow storage caps, u1 cast, per-init layer recheck (Genesis-Embodied-AI#513)
* [Build] Autodiff 15: Replace 2022 MoltenVK pin with LunarG Vulkan SDK fetch and sanitise MoltenVK cap advertisement (Genesis-Embodied-AI#551)
* [Test] Suppress stock pytest-timeout to avoid conflict with pytest_hardtle (Genesis-Embodied-AI#557)
* [Vulkan] Use SDK validation layer for debugPrintf instead of apt package (Genesis-Embodied-AI#562)
* [Test] Fix flaky perf_dispatch tests by increasing work amounts (Genesis-Embodied-AI#559)
* [Test] Add --maxfail CLI option to run_tests.py (default 20) (Genesis-Embodied-AI#558)
* [CI] Vulkan debug printf fix to address flaky tests (Genesis-Embodied-AI#563)
* [Docs] Add a new page to help for first time contributors (Genesis-Embodied-AI#426) Authored-by: v01dxyz <v01dxyz@v01d.xyz>
* [AutoDiff] Autodiff 16: Resolve reverse-mode adstack depths per-launch via runtime-evaluated SizeExpr (Genesis-Embodied-AI#543)
* Fix: raise error if device memory allocation fails (Genesis-Embodied-AI#451) (Genesis-Embodied-AI#453) Co-authored-by: v01dxyz <v01dxyz@v01d.xyz> Co-authored-by: Hugh Perkins <hughperkins@gmail.com>
* [CI] Add CI job to check line wrapping of comments and docs (Genesis-Embodied-AI#564)
* [Misc] Add coverage report to PRs, including kernels (Genesis-Embodied-AI#470)
* [CI] CI wrap check feeds only diffs to agent (Genesis-Embodied-AI#567)
* Skip 'flaky' test on MacOS CI. (Genesis-Embodied-AI#573)
* [Test] Fix missing `import sys` in test_fail_device_memory_allocation (Genesis-Embodied-AI#574)
* [CI] Fix Vulkan debugPrintf flake with session-scoped warmup (Genesis-Embodied-AI#571)
* [AutoDiff] determine_ad_stack_size: replace whole-CFG Bellman-Ford with SCC + DAG DP (Genesis-Embodied-AI#575)
* [Test] Fix macOS OOM skip reason to describe actual root cause (Genesis-Embodied-AI#576)
* [Lang] whole_kernel_cse: 2.5x compile time speedup on large kernels (Genesis-Embodied-AI#577)
* [CI] Add CI check for unnecessarily deleted comments (Genesis-Embodied-AI#570)
* [CI] Migrate coverage report to github Check page (Genesis-Embodied-AI#566)
* [Lang] Skip IR verifier between passes unless debug=true (Genesis-Embodied-AI#579)
* [Lang] Inline AdStack ops on release LLVM codegen: dramatically reduces compile time for adstack-enabled reverse-mode kernels (Genesis-Embodied-AI#584)
* [CUDA] Honor offline_cache=False end-to-end so QD_OFFLINE_CACHE=0 actually gives a cold compile (Genesis-Embodied-AI#580)
* [Type] Tensor 24 (Genesis-Embodied-AI#561) Co-authored-by: hugh <hugh@slurm-login-0.slurm-login.tenant-slurm.svc.cluster.local>
* [Lang] auto_diff host-walk reductions: dramatically faster front-end compile time on adstack-enabled reverse-mode kernels (Genesis-Embodied-AI#587)
* [AutoDiff] Speed up reverse-mode kernel launches on GPU backends (Genesis-Embodied-AI#578)
* [Vulkan] Move adstack-sizer scratch out of Function-scope memory to fix SPIR-V pipeline build failures (Genesis-Embodied-AI#588)
* [AutoDiff] Improve diagnosis of unsupported reverse-mode AD patterns (Genesis-Embodied-AI#590)
* [Bug] Fix: promote Ndarray to AnyArray in build_Name for flattened struct fields (Genesis-Embodied-AI#592)
* [SPIR-V] Shrink reverse-grad kernel MSL by ~50% (Genesis-Embodied-AI#591)
* [CI] Add CI check that PR changes have test coverage (Genesis-Embodied-AI#596)
* [Perf] Enable zero-copy in to_torch() and to_numpy() (Genesis-Embodied-AI#450)
* Add BufferView: safe sub-range ndarray access for kernels (Genesis-Embodied-AI#585) Co-authored-by: alanray-tech <alanray-tech@users.noreply.github.com> Co-authored-by: Hugh Perkins <hughperkins@gmail.com>
* [Doc] Add user-facing fastcache documentation (Genesis-Embodied-AI#597) Co-authored-by: hugh <hugh@slurm-login-0.slurm-login.tenant-slurm.svc.cluster.local>
* [Misc] Upgrade to enable v1 dlpack so to_numpy(copy=False) writable (Genesis-Embodied-AI#598) Co-authored-by: root <root@rtx-209-201.slurm-compute.tenant-slurm.svc.cluster.local>
* [AutoDiff] Cut reverse-mode adstack memory usage 10x on all backends (Genesis-Embodied-AI#599)
* [Misc] Add CI check for feature file factorization (Genesis-Embodied-AI#606)
* [Perf] Skip _recursive_set_args for all-Field frozen dataclass structs (Genesis-Embodied-AI#607) Co-authored-by: Cursor <cursoragent@cursor.com>
* [AutoDiff] SNode-arm bound-expr capture rejects fold-attack gate indices (Genesis-Embodied-AI#610)
* [Misc] Suppress field fastcache warning for qd.Tensor (Genesis-Embodied-AI#615) Co-authored-by: Cursor <cursoragent@cursor.com>
* [AutoDiff] Adstack heap: clip reducer count by per-task loop trip count (compile-time and SizeExpr-evaluated) (Genesis-Embodied-AI#611)
* [Misc] Forward copy= through qd.Tensor, add copy=None option (Genesis-Embodied-AI#616) Co-authored-by: Cursor <cursoragent@cursor.com>
* [Doc] Update README (Genesis-Embodied-AI#617) Co-authored-by: Cursor <cursoragent@cursor.com>
* [CI] Fix coverage report showing def lines as uncovered (Genesis-Embodied-AI#623) Co-authored-by: Cursor <cursoragent@cursor.com>
* [Perf] Generic launcher: persistent context, JIT-pointer reuse, Metal compute encoder, LLVM-GPU async memory ops (Part 1/2) (Genesis-Embodied-AI#619)
* [CI] Encode Python-first testing policy in coverage-check prompt (Genesis-Embodied-AI#622) Co-authored-by: Cursor <cursoragent@cursor.com>
* [CI] Add PR Line change report (Genesis-Embodied-AI#624) Co-authored-by: Cursor <cursoragent@cursor.com>
* [CI] Disable quadrants pytest plugin during quadrants internal coverage runs (Genesis-Embodied-AI#629) Co-authored-by: Cursor <cursoragent@cursor.com> * [AutoDiff] Adstack load+store eliminations: EliminateRecomputableAdStackPushes pass + leaf extensions (Genesis-Embodied-AI#621) * [CI] Simplify coverage PR comment to a single linked line (Genesis-Embodied-AI#630) * [CUDA] Add AGX Thor, SM_110 (Genesis-Embodied-AI#631) Co-authored-by: Johnny Nunez and Hugh Perkins * [CI] Lines changed report: collapse PR comment to a single linked totals line (Genesis-Embodied-AI#632) * [FEATURE] Support external Metal command queue via qd.init (Genesis-Embodied-AI#618) Co-authored-by: Cursor <cursoragent@cursor.com> * [Perf] Cache adstack-sizer metadata per task across SPIR-V + LLVM-GPU; per-snode / DeviceAllocation invalidation (Part 2/2) (Genesis-Embodied-AI#620) * [AutoDiff] Disable EliminateRecomputableAdStackPushes pending mutated-SNode chain-leaf fix (Genesis-Embodied-AI#633) * [AutoDiff] Adstack chain-clone safety: mutated-SNode leaf reject + load_top consumer-aware guard (Genesis-Embodied-AI#634) * [Docs] Add user-guide page for qd.simt.block.* primitives (Genesis-Embodied-AI#638) * [Docs] Expand qd.simt.subgroup user-guide page to cover every op (Genesis-Embodied-AI#639) * [Perf] Streams 1-4 (Genesis-Embodied-AI#410) * [Docs] Add user-guide page for matrix decompositions and solvers (Genesis-Embodied-AI#643) * [Bug] Revert "[Perf] Streams 1-4 (Genesis-Embodied-AI#410)" (Genesis-Embodied-AI#650) * [Docs] Add user-guide page for atomics and bit operations (Genesis-Embodied-AI#640) * [Docs] Add user-guide page for qd.simt.grid.* primitives (Genesis-Embodied-AI#641) * [AutoDiff] Adstack max-reducer: parallel multi-axis MaxOverRange dispatch (Genesis-Embodied-AI#635) * [AMDGPU] Fix amdgpu parallel rand init (Genesis-Embodied-AI#658) * [Perf] Adstack: skip max-reducer recognizer on CPU + lift host-eval cap (Genesis-Embodied-AI#655) * [Perf] Re-land Streams 1-4 with bug fixes 
(Genesis-Embodied-AI#653) * [AMDGPU] Apply device_memory_GB=0.3 cap to AMDGPU tests (Genesis-Embodied-AI#659) * [Perf] Per-launch host sync: drop wait_idle on SPIR-V, pin stream and drop stream_synchronize on CUDA/AMDGPU (Genesis-Embodied-AI#654) * [AMDGPU] Unload hipModule_t in JITModuleAMDGPU destructor (Genesis-Embodied-AI#660) * [AMDGPU] Trim default mempool on qd.reset() (Genesis-Embodied-AI#669) * [AMDGPU] Hoist rand-state buffer to process lifetime (Genesis-Embodied-AI#668) * [Streams] Use events for streams serialization on AMDGPU and CUDA (Genesis-Embodied-AI#667) * [Perf] Adstack max-reducer: launch cache + zero-copy result map; content-stable registry_id (Genesis-Embodied-AI#671) * [SPIR-V] dispatch_max_reducers: register each task with the real kernel name (Genesis-Embodied-AI#675) * [AutoDiff] Debug-mode field/grad/dual: dtype, layout, and access-time invariants (Genesis-Embodied-AI#677) * [Docs] Add user-guide page for qd.algorithms.* device-wide algorithms (Genesis-Embodied-AI#642) Co-authored-by: alanray-tech <alan.ray@genesis-ai.company> * [Docs] Doc for existing atomics: switch support table to per-backend columns (Genesis-Embodied-AI#657) Co-authored-by: alanray-tech <alan.ray@genesis-ai.company> * [GPU] Cross gpu atomics (Genesis-Embodied-AI#666) Co-authored-by: alanray-tech <alan.ray@genesis-ai.company> * [GPU] Make block operations portable cross-gpu (Genesis-Embodied-AI#664) * [Perf] CPU LLVM adstack-cache: skip per-launch bump-writes + ndarray_shapes capture on forward-only handles (Genesis-Embodied-AI#685) * [GPU] Cross-GPU for grid ops (Genesis-Embodied-AI#670) * [Math] Make bitop operations portable cross-gpu (Genesis-Embodied-AI#662) * [AMDGPU] Always use wave64, on both RDNA and CDNA (Genesis-Embodied-AI#687) * [AMDGPU] Use syncscope("agent") for atomix xor to avoid CAS livelock (Genesis-Embodied-AI#672) * [GPU] New bit ops for QIPC (Genesis-Embodied-AI#679) * [GPU] Subgroup ops cross-gpu (Genesis-Embodied-AI#665) * [Graph] Rename CUDA 
Graph to Graph in docs (Genesis-Embodied-AI#691) * [SPIR-V] Fix FIFO-queue ordering when sharing command queue. (Genesis-Embodied-AI#694) * [Atomics] New QIPC ops for atomics (Genesis-Embodied-AI#690) * Pass dataclass sub-structs into qd.func (Genesis-Embodied-AI#698) * [AMDGPU] HIP graph runtime support for @qd.kernel(graph=True) (Genesis-Embodied-AI#692) * [CI] Add per-file timing report to Mac Metal test job (Genesis-Embodied-AI#695) Co-authored-by: Cursor <cursoragent@cursor.com> * [CI] Enable kernel disk cache during tests (Genesis-Embodied-AI#696) * [Math] New QIPC ops for single-threaded linalg (Genesis-Embodied-AI#683) * [BREAKING][GPU] New QIPC ops for subgroups (Genesis-Embodied-AI#676) * [GPU] New QIPC ops for block (Genesis-Embodied-AI#684) * [GPU] New device-level ops for QIPC (Genesis-Embodied-AI#693) * [algorithms] PrefixSumExecutor: drop unused GRID_SZ local (Genesis-Embodied-AI#701) * [block] sync(): fix unsupported-arch error message (Genesis-Embodied-AI#700) * [volatile_load] add qd.volatile_load primitive (closes Genesis-Embodied-AI#648) (Genesis-Embodied-AI#702) * [AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info (Genesis-Embodied-AI#708) * [Vulkan] Declare GroupNonUniform SPIR-V caps and enable shaderSubgroupExtendedTypes (Genesis-Embodied-AI#707) * Fix duplicate HIP graph driver-function declarations after v1.0.0 merge The amd-integration fork had cherry-picked the HIP graph driver functions (graph_create / graph_destroy / graph_add_kernel_node / graph_instantiate / graph_exec_destroy / graph_launch), and upstream v1.0.0 added the same set. The per-file 3-way merge appended both copies into amdgpu_driver_functions.inc.h, producing redeclaration errors that broke the AMDGPU RHI/runtime compile. Drop the upstream duplicate block; the signatures are identical to the fork's existing declarations. 
Co-authored-by: Cursor <cursoragent@cursor.com> * Fix AMDGPU launcher coherence and num_instructions visibility after v1.0.0 merge - kernel_launcher.cpp: the 3-way merge spliced upstream v1.0.0's launch_llvm_kernel rewrite (ephemeral arg/context buffers, explicit-stream path, AmdgpuDefaultStream PinGuard) onto the AMD fork's kernarg-by-value + persistent-scratch design, leaving references to undefined `ephemeral_context_ptr`. Restore the fork's coherent launch_llvm_kernel verbatim; it calls the (already merged) enhanced launch_offloaded_tasks, which keeps the max-reducer dispatch and stream-parallel groups adapted onto the AMD launch path. - llvm_context.h: both the fork and upstream added `num_instructions`; the merge kept upstream's private placement, but the AMDGPU codegen force-inline heuristic calls it statically from outside the class. Move it back to the public section. Co-authored-by: Cursor <cursoragent@cursor.com> * Restore async result D2H and hoist kernarg vectors in AMDGPU launcher The v1.0.0 merge resolution regressed two amd-integration baseline optimizations in launch_llvm_kernel / launch_offloaded_tasks: - The per-launch result-buffer copy was a blocking memcpy_device_to_host, forcing a host stall on every value-returning launch and serializing the GPU pipeline. Restore the async D2H (the caller synchronizes lazily when it needs the value); external-array transfers still stream_synchronize once before reading back. - launch_task constructed the kernarg std::vectors from initializer lists ({kernarg_payload} / {kernarg_size}) on every dispatch (heap alloc + free per launch). Hoist arg_ptrs/arg_sizes out of the per-task launch and reuse. 
Co-authored-by: Cursor <cursoragent@cursor.com> * amdgpu: default to LDS permlane64 emulation; drop host-x86 barrier asm on retarget Two AMDGPU JIT-compile crashes surfaced after the v1.0.0 merge pulled in the QIPC subgroup ops (Genesis-Embodied-AI#676), which made the rigid constraint solver's wave-cooperative reductions route through `amdgpu_cross_half_shuffle_i32`. Both manifested as a SIGSEGV inside `llvm::SIInstrInfo::getInstSizeInBytes` during `JITSessionAMDGPU::compile_module_to_hsaco` (i.e. at first kernel launch), and reproduce on gfx942 / MI300X. Baseline 0.4.6 never emitted these constructs, which is why it was unaffected. 1. Native `llvm.amdgcn.permlane64` lowering crashes the bundled LLVM 22.1.0 AMDGPU backend. Default `amdgpu_permlane64` to the existing LDS-roundtrip software emulation on every target (it produces identical results). Add `QD_AMDGPU_USE_NATIVE_PERMLANE64=1` to opt back into the native instruction once the backend bug is fixed; the old `QD_AMDGPU_FORCE_PERMLANE64_FALLBACK` is now the default and still honored. This is the actual crash fix. 2. The runtime module is compiled by the host x86_64 clang and only retargeted to amdgcn here, so `amdgpu_cross_half_shuffle_i32`'s `__asm__ volatile("" : "+v"(byte))` optimization barrier carries x86 flag clobbers (`~{dirflag},~{fpsr},~{flags}`) that are meaningless on AMDGPU. The IR verifies but the empty-body INLINEASM is invalid on the amdgcn target. Neutralize empty-body barrier asm during retarget (forward the tied value, then erase) so no stale host asm reaches codegen. On the wave64 targets we ship `ds_bpermute` already addresses the full wave, so the hint is a no-op. Co-authored-by: Cursor <cursoragent@cursor.com> * style: apply clang-format (v19.1.7) to AMDGPU fn_attrs and launcher sources CI pre-commit's clang-format hook reformatted these files (long declarations/lambda signatures collapsed onto single lines per the repo's clang-format config). Apply the same formatting so the hook passes. 
No functional changes. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(amdgpu): use CreateNeg for branchless i32 sgn instead of CreateSub(0, input) clang-tidy (modernize-use-nullptr, -warnings-as-errors) flagged `builder->CreateSub(0, input)` in the i32 sgn path: the literal `0` binds to the `llvm::Value*` LHS parameter as a null pointer, not an integer zero. Replace with `builder->CreateNeg(input)`, which emits `0 - input` with a proper zero constant -- identical intended semantics, and clang-tidy clean. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Robert Dazi <14996868+v01dXYZ@users.noreply.github.com> Co-authored-by: v01dxyz <v01dxyz@v01d.xyz> Co-authored-by: Hugh Perkins <hughperkins@gmail.com> Co-authored-by: Alexis DUBURCQ <alexis.duburcq@gmail.com> Co-authored-by: hugh <hugh@slurm-login-0.slurm-login.tenant-slurm.svc.cluster.local> Co-authored-by: alanray-tech <alan.ray@genesis-ai.company> Co-authored-by: alanray-tech <alanray-tech@users.noreply.github.com> Co-authored-by: root <root@rtx-209-201.slurm-compute.tenant-slurm.svc.cluster.local> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Johnny <johnnynuca14@gmail.com>

hughperkins added 21 commits April 10, 2026 09:06

[CI] Add test coverage reporting as PR comments

773110d

Enable pytest-cov in Linux CI and post coverage summary as a PR comment via MishaKav/pytest-coverage-comment. Uses only the built-in GITHUB_TOKEN — no external service or org permissions.

[CI] Add diff coverage reporting on PR comments

05732ca

Use diff-cover to report percentage of changed/added lines that are covered by tests. Posts a sticky PR comment with both diff coverage and overall project coverage. No external services needed — uses only the built-in GITHUB_TOKEN.

[CI] Add diff coverage gate at 80% for changed lines

bd993fa

Fail the Linux CI if less than 80% of changed/added Python lines are covered by tests. The coverage PR comment is posted before the check so numbers are always visible.
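
The gate amounts to a ratio over changed lines. A minimal sketch, assuming hypothetical inputs (a mapping from file path to sets of changed and covered line numbers); the CI relies on diff-cover for the real computation, and this is not its implementation:

```python
def diff_coverage_percent(changed, covered):
    """Percentage of changed lines that are covered.

    Both arguments map a file path to a set of line numbers (hypothetical input shape).
    """
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        return 100.0  # nothing changed, so nothing can be uncovered
    hit = sum(len(lines & covered.get(path, set())) for path, lines in changed.items())
    return 100.0 * hit / total


def passes_gate(changed, covered, threshold=80.0):
    """True when the diff coverage meets the threshold (80% in the commit above)."""
    return diff_coverage_percent(changed, covered) >= threshold


changed = {"a.py": {1, 2, 3, 4}, "b.py": {10}}
covered = {"a.py": {1, 2, 3}, "b.py": {10, 11}}
print(diff_coverage_percent(changed, covered))  # 4 of 5 changed lines are covered
print(passes_gate(changed, covered))
```

Lines covered but not changed (line 11 of `b.py` here) do not affect the result, which is why the gate only constrains what the PR touches.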

[CI] Fix diff-cover: use --format markdown instead of --md-report

ac86607

[CI] Fix diff-cover format flags

580738b

Use --format markdown:file.md syntax instead of --format markdown with stdout redirect.

[CI] Exclude JIT-compiled kernel code from coverage

331b31a

Code inside @qd.func / @qd.kernel (and @ti. variants) is JIT-compiled to GPU code, so Python coverage.py can never trace it. Exclude these blocks to avoid false-negative coverage on kernel-heavy files.
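
One way to express such an exclusion is a regex over decorator lines, since coverage.py's exclusion rules treat a matching decorator line as excluding the whole decorated function. The pattern below is an assumption for illustration, not the one used in the commit:

```python
import re

# Hypothetical pattern: matches @qd.func / @qd.kernel and the @ti. variants,
# with or without call arguments.
KERNEL_DECORATOR = re.compile(r"^\s*@(qd|ti)\.(func|kernel)\b")

lines = [
    "@qd.kernel",
    "    @ti.func",
    "@qd.kernel(graph=True)",
    "@staticmethod",
    "def host_helper():",
]
matched = [line for line in lines if KERNEL_DECORATOR.search(line)]
print(matched)
```

Only the three kernel decorators match; ordinary host-side decorators and `def` lines stay subject to normal coverage tracing.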

[Test] Run kernel coverage e2e tests on both CPU and CUDA

dd0e850

Fix stale coverage field after qd.init() re-initialization

5e13cee

Track which Program instance the coverage field belongs to. Re-allocate after qd.init() destroys the old SNode tree, preventing dangling field references with garbage dimensions.
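
The ownership check described above can be sketched as follows. `Program`, `CoverageField`, and `get_field` are hypothetical stand-ins, not the names used in `_kernel_coverage.py`:

```python
class Program:
    """Stand-in for the runtime object that qd.init() recreates (hypothetical)."""


class CoverageField:
    """Stand-in for probe storage tied to one Program (hypothetical)."""

    def __init__(self, owner):
        self.owner = owner


_field = None


def get_field(current_program):
    """Return a field valid for current_program, re-allocating if the owner changed."""
    global _field
    if _field is None or _field.owner is not current_program:
        _field = CoverageField(current_program)
    return _field


p1 = Program()
f1 = get_field(p1)
same = get_field(p1) is f1   # same program: the field is reused
p2 = Program()               # simulates qd.init() replacing the runtime
f2 = get_field(p2)
print(same, f2 is f1)
```

Comparing by identity (`is not`) rather than equality is the point of the sketch: a new runtime must never reuse storage that belonged to a destroyed one.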

Fix off-by-one in AST rewriter unit test expectations

4406489

start_lineno=10 + relative line 2 - 1 = 11, not 12.
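
The arithmetic follows from both numbers being 1-based: the first line of the function source is relative line 1 and sits at `start_lineno` itself. A small sketch using the standard `ast` module; the helper name `absolute_lineno` is an assumption for illustration:

```python
import ast
import textwrap


def absolute_lineno(start_lineno, relative_lineno):
    """Map a 1-based line within a function's source to a 1-based file line."""
    return start_lineno + relative_lineno - 1


src = textwrap.dedent("""
    def k(x):
        y = x + 1
        return y
""").strip()

tree = ast.parse(src)
body_first = tree.body[0].body[0]               # the `y = x + 1` statement
print(body_first.lineno)                        # relative line within the snippet
print(absolute_lineno(10, body_first.lineno))   # file line if the def starts at line 10
```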

Add simt e2e test with block.sync() and subgroup.shuffle

d47004c

Uses portable subgroup.shuffle instead of CUDA-specific warp.shfl, and qd.gpu arch so it runs on both CUDA and Vulkan.

Test runtime-branched subgroup shuffle with coverage probes

fd27142

The kernel reads flag[0] (a runtime field value) to choose between two shuffle paths, verifying that coverage correctly tracks which branch executed and which didn't.

Fix simt test arch filter: use [qd.cuda, qd.vulkan] not [qd.gpu]

565da3d

qd.gpu doesn't match QD_WANTED_ARCHS=cuda filtering in test_utils.

Fix simt test: use arch=qd.gpu (already a list), not arch=[qd.gpu]

eadc831

Merge branch 'hp/kernel-coverage' into hp/pr-coverage-w-kernels

795b3b5

hughperkins commented Apr 10, 2026

Comment thread python/quadrants/lang/ast/ast_transformer_utils.py Outdated

hughperkins added 5 commits April 10, 2026 16:10

Fix formatting and lint: black, ruff imports, pylint disable

b480a06

Remove accidentally committed diff-cover.html artifact

470fb92

Add coverage artifacts to .gitignore

68c39fb

Only import _kernel_coverage when QD_KERNEL_COVERAGE=1

98a17d5

Guard the import behind the env var check at call sites so the module is never loaded in normal operation. This lets _kernel_coverage use top-level imports instead of scattered lazy imports.
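
A call-site guard of this kind might look like the sketch below. The helper name `load_coverage_module` is hypothetical, and a standard-library module stands in for `_kernel_coverage` so the example runs anywhere:

```python
import importlib
import os


def load_coverage_module(module_name="json"):
    """Import the coverage helper only when QD_KERNEL_COVERAGE=1.

    module_name defaults to a stdlib stand-in; real call sites would name
    the _kernel_coverage module instead.
    """
    if os.environ.get("QD_KERNEL_COVERAGE") != "1":
        return None
    return importlib.import_module(module_name)


os.environ["QD_KERNEL_COVERAGE"] = "0"
disabled = load_coverage_module()
os.environ["QD_KERNEL_COVERAGE"] = "1"
enabled = load_coverage_module()
print(disabled, enabled.__name__)
```

Because the check happens before the import, the guarded module can use ordinary top-level imports without affecting runs where coverage is off.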

Suppress pyright import error for optional coverage dependency

d9482bb

hughperkins commented Apr 11, 2026

Comment thread .gitignore

/python/test_env

/CHANGELOG.md

/.coverage

/.coverage.*

hughperkins Apr 11, 2026

Collaborator Author

👍

hughperkins commented Apr 11, 2026

claude Bot reviewed Apr 23, 2026

hughperkins added 3 commits April 23, 2026 14:48

Fix CUDA CI: pass short filename to run_tests.py

d6d50e2

run_tests.py auto-prepends "test_" to filenames that don't start with it, so passing the full path "tests/python/test_kernel_coverage.py" was mangled to "test_tests/python/test_kernel_coverage.py". Made-with: Cursor
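
The mangling is easy to reproduce. A sketch of the prefixing rule as the commit message describes it; `normalize_test_name` is a hypothetical name, not the actual function in `run_tests.py`:

```python
def normalize_test_name(name):
    """Prepend "test_" unless the name already starts with it."""
    return name if name.startswith("test_") else "test_" + name


full_path = normalize_test_name("tests/python/test_kernel_coverage.py")
short_name = normalize_test_name("test_kernel_coverage.py")
print(full_path)
print(short_name)
```

The full path starts with `tests/`, not `test_`, so it gets a spurious prefix, while the short filename passes through unchanged.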

fix: include QD_KERNEL_COVERAGE in fastcache key

d3e4bad

Without this, toggling coverage between runs serves a stale cached kernel: coverage-off cached kernel gives 0% coverage when re-enabled, and coverage-on cached kernel causes NameError on _qd_cov when disabled.
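
The idea is that anything which changes the compiled kernel must also change the cache key. A minimal sketch with a hypothetical `cache_key` function; the real fastcache key has other inputs that are not shown on this page:

```python
import hashlib
import os


def cache_key(kernel_source, env=None):
    """Hypothetical cache key: hash of the kernel source plus the coverage flag."""
    env = os.environ if env is None else env
    coverage_on = env.get("QD_KERNEL_COVERAGE") == "1"
    payload = f"{kernel_source}\x00coverage={int(coverage_on)}"
    return hashlib.sha256(payload.encode()).hexdigest()


src = "def k(x):\n    x[0] = 1\n"
key_off = cache_key(src, {"QD_KERNEL_COVERAGE": "0"})
key_on = cache_key(src, {"QD_KERNEL_COVERAGE": "1"})
print(key_off != key_on)
```

With distinct keys, a kernel compiled without probes can no longer be served to a coverage-enabled run, and the reverse also holds.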

claude Bot reviewed Apr 23, 2026

hughperkins added the ok-to-merge label Apr 23, 2026

Merge origin/main into hp/pr-coverage-w-kernels (resolve conflict in 4_test.sh)

245de20

Made-with: Cursor

hughperkins removed the ok-to-merge label Apr 24, 2026

hughperkins added 3 commits April 24, 2026 13:51

Unwrap hard-wrapped lines in kernel_coverage.md

b80cb55

Merge branch 'main' into hp/pr-coverage-w-kernels

f4052b1

Rewrap code comments at 120 chars instead of ~80

3c3191a

erizmr approved these changes Apr 24, 2026

duburcqa reviewed Apr 24, 2026

Comment thread tests/python/test_intrinsics.py

duburcqa Apr 24, 2026

Contributor

Unrelated.

duburcqa approved these changes Apr 24, 2026

hughperkins added 2 commits April 24, 2026 14:57

Revert test_intrinsics.py changes to match origin/main

920b379

Merge branch 'main' into hp/pr-coverage-w-kernels

fa460fe

claude Bot reviewed Apr 24, 2026

Merge branch 'main' into hp/pr-coverage-w-kernels

786e1dc

hughperkins merged commit 2aae667 into main Apr 25, 2026
21 of 22 checks passed

hughperkins deleted the hp/pr-coverage-w-kernels branch April 25, 2026 19:55

| Process | Compile order | probe_ids for K1 | probe_ids for K2 | `_probe_map` |
| --- | --- | --- | --- | --- |
| A | K1, K2 | 0, 1, 2 | 3, 4, 5 | 0→L10, 1→L11, 2→L12, 3→L20, 4→L21, 5→L22 |
| B | K2, K1 | 3, 4, 5 | 0, 1, 2 | 0→L20, 1→L21, 2→L22, 3→L10, 4→L11, 5→L12 |
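
The table suggests that probe ids are handed out in compile order, so the same id can refer to different source lines in different processes. The sketch below reproduces the two maps from the table; `build_probe_map` and `KERNEL_LINES` are hypothetical names, and the assignment scheme is inferred from the table rather than taken from the implementation:

```python
from itertools import count

KERNEL_LINES = {"K1": [10, 11, 12], "K2": [20, 21, 22]}  # line numbers from the table


def build_probe_map(compile_order):
    """Assign consecutive probe ids in compile order; returns {probe_id: line}."""
    ids = count()
    return {next(ids): line for kernel in compile_order for line in KERNEL_LINES[kernel]}


map_a = build_probe_map(["K1", "K2"])   # process A
map_b = build_probe_map(["K2", "K1"])   # process B

hit_ids = {0, 1, 2}                     # the same raw ids fire in both processes
lines_a = sorted(map_a[i] for i in hit_ids)
lines_b = sorted(map_b[i] for i in hit_ids)
print(lines_a, lines_b)
```

The same raw ids resolve to L10-L12 in process A and L20-L22 in process B, so ids are only meaningful together with the `_probe_map` of the process that produced them.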

		cov_branch = getattr(config.option, "cov_branch", False) or False
		os.environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"
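
For context, the two lines above could sit in a `pytest_configure`-style hook roughly like this sketch. The helper name `configure_kernel_coverage`, the use of `cov_source` to detect pytest-cov, and the `setdefault` opt-out handling are assumptions, not the plugin's actual code; a `SimpleNamespace` stands in for the pytest config object so the example runs without pytest:

```python
from types import SimpleNamespace


def configure_kernel_coverage(config, environ):
    """Sketch of a hook body that enables kernel coverage when pytest-cov is active."""
    # Assumption: pytest-cov stores the values passed to --cov under `cov_source`.
    cov_active = bool(getattr(config.option, "cov_source", None))
    if cov_active:
        # setdefault leaves an explicit QD_KERNEL_COVERAGE=0 opt-out untouched.
        environ.setdefault("QD_KERNEL_COVERAGE", "1")
    cov_branch = getattr(config.option, "cov_branch", False) or False
    environ["_QD_KCOV_ARC"] = "1" if cov_branch else "0"


env = {}
config = SimpleNamespace(option=SimpleNamespace(cov_source=["pkg"], cov_branch=None))
configure_kernel_coverage(config, env)
print(env)
```

The `or False` in the original snippet matters because the branch option may be `None` when the flag is not passed; the expression normalizes that to a plain boolean before choosing "1" or "0".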

Conversation

hughperkins commented Apr 10, 2026 • edited

Brief Summary

Summary

What's included

Strengths

Weaknesses / known limitations

Walkthrough

hughperkins Apr 10, 2026

hughperkins Apr 10, 2026

hughperkins Apr 12, 2026

hughperkins Apr 11, 2026

hughperkins Apr 11, 2026

hughperkins Apr 11, 2026

hughperkins commented Apr 23, 2026

claude Bot left a comment

hughperkins commented Apr 23, 2026

github-actions Bot commented Apr 23, 2026

Coverage Report (49f8dea62)

hughperkins commented Apr 23, 2026

claude Bot Apr 23, 2026

What the bug is and how it manifests

The specific code path that triggers it

Step-by-step proof with a concrete example

Why existing code does not prevent it

What the impact would be

How to fix it

github-actions Bot commented Apr 23, 2026

Coverage Report (6081d31d0)

hughperkins commented Apr 23, 2026

so basically, no need to do anything, and good to merge now?

github-actions Bot commented Apr 24, 2026

Coverage Report (50e1f8ad6)

erizmr left a comment

duburcqa Apr 24, 2026

claude Bot Apr 24, 2026

What the bug is and how it manifests

The specific code path that triggers it

Empirical verification

Why existing code does not prevent it

Impact

Step-by-step proof

How to fix it

github-actions Bot commented Apr 24, 2026

Coverage Report (40a0ca011)

Coverage Report (`2b7d0561d`)