refactor(wasm): R2 — parse_wasm_config serde reduction (#689 follow-up) by bashandbone · Pull Request #697 · marquetools/marque

bashandbone · 2026-05-22T19:05:16Z

Measurements

metric	pre (`origin/staging` `7af85fb2`)	post (this branch `b974550b`)	delta
WASM pre-opt (`--profile release-web` — CI-gate-comparable)	1,372,062 B	1,364,312 B	−7,750 B (−0.56%)
WASM pre-opt (`--release`, agent's working measurement)	1,372,791 B	1,365,032 B	−7,759 B (−0.57%)
`lint_10kb` p95	428.08 µs	432.21 µs	+0.45% — Criterion p=0.88, no statistically significant change
`cargo test --workspace`	green	green	—
`cargo +stable clippy --workspace --all-targets --all-features -- -D warnings`	green	green	—

Above the synthesis-brief gate (≥4 KB net savings). Slightly under the upper end of the preflight 5–8 KB estimate band, consistent with the linker-dedup uncertainty the preflight flagged. Bench is hot-path-neutral as expected (config parsing happens once per lint_native/fix_native call, not on the engine hot path).

Methodology note: wasm-pack 0.14 on this dev environment doesn't apply [package.metadata.wasm-pack.profile.release-web] wasm-opt flag lists reliably; the agent's working measurement used --release for build-stability. The CI-gate-comparable number (--profile release-web, top row above) confirms the delta independently to within 9 bytes — the relative delta is the trustworthy signal. Same caveat as documented in tools/wasm-size-check.sh.

Summary

Retires #[derive(Deserialize)] from WasmConfig and #[derive(Serialize)] from WasmConfigCacheKey<'a> (latter deleted entirely) — the two derives that together accounted for ~10 KB of WASM bundle bloat per perf(wasm): investigate +5.18% WASM binary regression vs. baseline (cumulative across staging window) #689 attribution.
Replaces them with explicit serde_json::Value::as_object().get(...) extraction via private wasm_config_from_value, plus direct serde_json::Map construction in build_cache_key. serde_json::Value is already in the WASM bundle (used by Diagnostic JSON building, Map<String, Value> for FixIntent-style args, RawValue for diagnostics), so the swap adds no new monomorphization.
TDD-first ordering: 19 byte-identity + type-mismatch + Constitution-III tests written and passing on pre-refactor code in commit 1 (fcd5ebca), refactor lands in commit 2 (2ad85eaa). One f32 byte-identity gotcha caught by the tests before reaching review.

Constitution III check — runtime-config does not expand engine's semantic surface

Constitution III §"runtime configuration that expands the engine's semantic surface" is the load-bearing constraint on WasmConfig. R2 PRESERVES and arguably TIGHTENS the surface:

property	pre (Derive)	post (R2)
Closed accept-list of `{classifier_id, confidence_threshold, corrections, deadline_ms}`	implicit via struct field list	explicit `obj.get(...)` enumeration at call site
Unknown JSON fields silently ignored	yes (struct fields are the implicit accept-set)	yes (any name not in the explicit `obj.get(...)` list is ignored) — pinned by `r2_unknown_field_silently_ignored` + `r2_unknown_field_coexists_with_known` unit tests
`classifier_id` accepts any serde-Deserializable type	yes (Derive on `Option<String>` accepts string / null)	NO — only `Value::String` accepted; everything else is a loud error. Tightening, not widening.
`null` JSON value treated as absent	yes (`#[serde(default)]` + `Option<T>`)	yes (`None

The Constitution III invariant is preserved. The tightening on classifier_id (and the other three fields) is a strict narrowing of what gets accepted — a JS caller passing a number for classifier_id now gets a loud error instead of being silently coerced.

f32 byte-identity gotcha

serde_json::Value::Number is internally f64-only. Naive replacement Number::from_f64(threshold as f64) widens the f32 bit-pattern to f64 BEFORE the shortest-roundtrip serializer formats it — for 0.85_f32 (which is not exactly representable as binary32) this produces "0.8500000238418579" instead of the f32 shortest-roundtrip "0.85" that serialize_f32 would emit. The discrepancy breaks cache-key byte-identity with the retired WasmConfigCacheKey #[derive(Serialize)] path.

Fix routes f32 through serde_json::to_string(&threshold) (which uses serialize_f32 / shortest-f32-roundtrip) then parses the resulting string back into a Value::Number. The roundtrip is safe for any finite f32 and cheap (one short alloc + one parse of a known-valid numeric literal) on engine-cache-miss only.

The byte-identity test r2_byte_identity_threshold_only exercises 0.85 specifically — a value that's NOT exactly representable as f32, so it catches a regression to the naive from_f64 path. The fix and rationale are documented in code comments at lines 1083–1097.

Behavior changes — errors from the Derive surface

input class	pre (Derive)	post (R2)
Malformed JSON (`{...,`)	`Err` with line/column context	`Err` with line/column context (outer `serde_json::from_str::<Value>(s)` unchanged)
Wrong type on a known field (`classifier_id: 123`)	`Err` with line/column, generic shape error	`Err` with field name + type hint (e.g., `"classifier_id must be a string; got number"`) — no column
Wrong type on a `corrections` value (`{"k": 42}`)	`Err` with line/column	`Err` with `corrections[k]` key + type hint
Unknown field (`{"some_other_field": 42}`)	silently ignored	silently ignored (preserved)
Top-level non-object (`"hello"`, `[1,2]`, `42`)	`Err` with shape error	`Err` containing `"must be a JSON object"` + type hint

Net trade: gain field names in errors (better diagnostics for JS callers), lose column numbers for field-level errors (acceptable — JS callers get column numbers from the outer parse for malformed JSON, which is the more common error class).

Adjacent-bug survey

BatchEntry #[derive(Deserialize)] at lib.rs:811 — deferred per synthesis brief. Two plain String fields (no Option, no HashMap); expected savings <1 KB. Unlikely to clear the 4 KB gate solo; will file as R2b only if combinable with other small cuts.
~18 OUTPUT-side #[derive(Serialize)] impls (DiagnosticJson, FixResultJson, RuleIdJson, FixJson, AuditConfidenceJsonV1_0, DeadlineExceededBodyJson, etc., lines 284-1403) — these are the JS-API wire contract per contracts/diagnostic.json and contracts/audit-record.json. EXPLICITLY out of scope for the R-series. Touching them is a contract-break PR requiring schema-version bump coordination.
serde_json::Map BTreeMap-backed assumption — the workspace's serde_json = "1.0.149" is pinned without preserve_order, so Map is alphabetical-iterating. WasmConfig field names happen to be alphabetical, which coincides with the struct-declaration order the prior Derive used → byte-identity preserved without preserve_order. The WasmConfig struct doc-comment + build_cache_key doc-comment + the byte-identity regression tests triangulate this constraint for future maintainers.

Constitution checks

VII §IV (scheme-adoption boundary): ✅ N/A — marque-wasm sits ABOVE the rule chain. This PR touches crates/wasm/src/lib.rs + claudedocs/wasm-bundle-analysis-2026-05-22.md only. Zero edits to engine, scheme, rules, core, ism, capco.
VIII (citation fidelity): ✅ No new § citations added.
G13 (audit-content-ignorance): ✅ N/A — config parsing is engine-INPUT, not audit-OUTPUT.
Principle I (Performance): ✅ lint_10kb Criterion p=0.88 — no statistically significant change. SC-001 16 ms ceiling 50× headroom intact.
III (runtime configuration / engine semantic surface): ✅ PRESERVED + TIGHTENED. See Constitution III check section above.

Test plan

Review-pass changes (commit `b974550b`)

Three reviewer nits addressed in the final fix-up commit:

MEDIUM (rust-specialist) — expect("corrections present") at build_cache_key line 1100 → structurally self-proving if let Some(corrections) = cfg.corrections.as_ref().filter(|c| !c.is_empty()). Same semantic; Some-and-non-empty proof now visible at the binding site.
LOW (general code-reviewer) — WasmConfig struct doc-comment now documents the alphabetical-field-naming constraint with a cross-reference to build_cache_key. A developer adding a new field encounters the constraint upfront rather than discovering it via byte-identity test failure.
LOW (general code-reviewer) — measurement methodology discrepancy (--release vs --profile release-web) reconciled in the PR body's Measurements section above, with both numbers shown side-by-side and the wasm-pack 0.14 limitation noted.

Declined: pinning field-name presence in the 8 type-mismatch test error strings (LOW per general reviewer; flagged as judgment call). The boundary the tests pin is loud-vs-silent, which is the API contract; field-name quality is implementation-level. Pinning it would lock the present wording, which we explicitly accepted dropping serde's column-number context for.

References

closes-part-of perf(wasm): investigate +5.18% WASM binary regression vs. baseline (cumulative across staging window) #689 — WASM regression investigation series
refs refactor(capco): R1 — sort-consolidation WASM cut (#689 follow-up) #696 — R1 precedent (sort-consolidation WASM cut, same investigation)
refs tools/wasm-monoaudit.sh — twiggy monos returns Σ [0 Total Rows] (DWARF not preserved) #692 — companion tooling fix for tools/wasm-monoaudit.sh DWARF preservation

…onfig (#689) TDD step 1 of the R2 serde-reduction refactor. Adds 19 tests to the existing in-module `#[cfg(test)] mod tests` block in `crates/wasm/src/lib.rs` that pin behavior of `parse_wasm_config` before the `#[derive(Deserialize)]` removal: - 9 byte-identity cache-key fixtures capturing exact emission of the current `#[derive(Serialize)] WasmConfigCacheKey` path: None / empty object / empty corrections / deadline-only all map to cache_key=None; classifier_id / threshold / corrections / all-three / deadline+ classifier produce literal byte strings in struct-declaration order (which coincides with alphabetical here — alphabetical pins the shape, struct-declaration order will pin a future field insertion that breaks alphabetical). - 8 type-mismatch loud-failure tests asserting `is_err()` for non- object top-level inputs and field-level type mismatches. Pre-refactor these produce serde-derived error messages; post-refactor they will produce field-name-tagged messages. Tests assert only the loud-vs- silent boundary, not exact wording — both versions pass. - 2 Constitution III preservation tests pinning the silently-ignore- unknown-fields contract. The current `#[derive(Deserialize)]` has no `deny_unknown_fields`; the post-refactor `obj.get(...)` enumeration matches that behavior. WASM target's runtime-config surface MUST NOT widen — these tests catch a regression where a future hand- rolled parser accidentally rejects unknown fields and would block forward-compatible JS callers. All 19 tests pass on `origin/staging` HEAD (pre-refactor). Captured baseline strings via a standalone serde-derive reproducer at authorship; the in-test assertions are the live regression gate. Helpers stay private per the R2 synthesis brief's helper-visibility decision — tests live alongside the existing build_cache_key / parse_deadline_ms tests in `mod tests`, not under `crates/wasm/tests/`, because the integration-test directory has no access to `parse_wasm_config` / `build_cache_key` (intentional encapsulation). Refs #689 · refs #696 (R1 precedent)

…ey Serialize (#689) Replaces the `#[derive(Deserialize)] WasmConfig` and `#[derive(Serialize)] WasmConfigCacheKey` paths with explicit `serde_json::Value::as_object().get(...)` extraction and direct `serde_json::Map` construction in `crates/wasm/src/lib.rs`. The retired derives were the largest single contributor to `parse_wasm_config`'s twiggy-attributed symbol size after R1 (`HashMap<String, String>` Deserialize monomorphization ~3-4 KB plus per-field Visitor scaffolding ~3 KB plus the cache-key Serialize emission ~1.5 KB). Why an extraction rewrite and not a smaller-knob tweak: - `#[derive(Deserialize)]` on `WasmConfig` monomorphizes a full visitor + per-field __Field enum + a HashMap<String, String> deserializer, each of which is dead-weight code in a WASM build. The four-field accept-list (`classifier_id`, `confidence_threshold`, `corrections`, `deadline_ms`) is a closed contract — enumerating it at the call site via `obj.get(...)` is structurally narrower than the derive (the derive would have happily accepted any serde- deserializable type for a field named `classifier_id`), AND smaller after the linker is done. - `#[derive(Serialize)]` on `WasmConfigCacheKey` carries a similar per-field SerializeMap driver and skip_serializing_if codegen; the manual `serde_json::Map` construction reaches the same wire format directly. Constitution III preservation (binding constraint): - Closed accept-list of `{classifier_id, confidence_threshold, corrections, deadline_ms}` enumerated explicitly in `wasm_config_from_value`. - Unknown JSON fields are silently ignored — preserves the pre-R2 derive behavior (no `#[serde(deny_unknown_fields)]`). Pinned by the new `r2_unknown_field_silently_ignored` / `r2_unknown_field_coexists_with_known` tests. - Field-level type mismatches are loud errors with field- name context (e.g. `"classifier_id must be a string; got number"`). Net gain in diagnostic quality at the cost of serde's column-number context, which JS callers do not use (the outer `from_str::<Value>` parse still surfaces column numbers for malformed JSON). - JSON `null` continues to mean "absent" for every field (preserves the derive's `Option<T>` semantics). Cache-key byte-identity preservation: - `serde_json::Map` without the `preserve_order` feature is BTreeMap-backed and emits in alphabetical key order. `WasmConfig` field names are alphabetical today (`classifier_id` < `confidence_threshold` < `corrections`), which coincides with the struct-declaration order serde's `#[derive(Serialize)]` used. Byte-identity is pinned by the 9 R2 byte-identity tests added in the previous commit. - A future cache-key field that breaks alphabetical (e.g. inserting `aaa_new_field` between `classifier_id` and `confidence_threshold`) will fail the byte-identity tests and require either renaming the field or enabling `serde_json/preserve_order` and inserting in struct-declaration order. - One non-obvious detail: `serde_json::Value::Number` is f64-only. `Number::from_f64(threshold as f64)` widens before formatting and emits the long-form `"0.8500000238418579"` for the 0.85 bit-pattern, not the shortest-roundtrip `"0.85"` that serde's `serialize_f32` path produced. The R2 byte-identity tests caught this on the first build; the fix routes the f32 through `to_string(&threshold)` (uses `serialize_f32`) and then parses-back via `from_str::<Value>` so the resulting `Number` retains the shortest-f32 string form. Cost is one short alloc + one parse on engine-cache-miss only; cheap relative to the bundle savings. Constitution VII §IV (scheme-adoption boundary): N/A — `marque-wasm` sits above the rule chain. This commit touches ONLY `crates/wasm/src/lib.rs`. Zero edits to engine, scheme, rules, core, ism, or capco. Verified with `git diff --stat HEAD~1`. Net surface changes: - retired: `#[derive(Deserialize)] WasmConfig` - retired: `#[derive(Serialize)] WasmConfigCacheKey` (struct + impl) - added: `fn wasm_config_from_value(Value) -> Result<WasmConfig, String>` - added: `fn value_type_name(&Value) -> &'static str` - rewrote: `fn build_cache_key(&WasmConfig)` using `serde_json::Map` All 37 marque-wasm unit tests pass (19 R2 tests from commit 1 + 18 pre-existing). All 76 integration tests pass across 10 test binaries (113 total). `cargo +stable clippy -p marque-wasm --all-targets -- -D warnings` is clean. BatchEntry's `#[derive(Deserialize)]` (lib.rs:811) is OUT of scope per the R2 synthesis brief; defer as R2b follow-up only if measurable (two String fields, no Option, expected <1 KB). Refs #689 · refs #696 (R1 precedent)

WASM bundle size delta (pre-opt, `wasm-pack --release` with system wasm-opt): pre-R2: 1,372,791 bytes (post-R1 baseline) post-R2: 1,365,032 bytes delta: −7,759 bytes (−0.57%) Above the R2 synthesis-brief 4 KB go/no-go gate. The estimate band in the preflight was 5-8 KB net — the measured delta lands in the lower half of that band, consistent with the preflight's caveat that the dominant `HashMap<String, String>` Deserialize cost is hard to predict exactly because of potential linker deduplication against other String-keyed HashMap deserializers elsewhere in the binary. `lint_10kb` bench (criterion, `cargo bench -p marque-engine --bench lint_latency -- lint_10kb`): pre: [422.35 µs 428.08 µs 434.97 µs] post: [421.49 µs 432.21 µs 447.64 µs] change: +0.45% median, p=0.88 — no statistically significant change. Hot-path neutral, as expected — config parsing is once-per-call, not on the hot path. Constitution Principle I satisfied (SC-001 16 ms ceiling intact; SC-001's threshold is 50× the post-R2 median). Also marks the R2 row in the candidate table as landed and inlines an "f32 byte-identity gotcha" note for future serde-reduction work — `Value::Number` is f64-only, so `Number::from_f64(threshold as f64)` widens before formatting and emits the long-form `"0.8500000238418579"` rather than the shortest-roundtrip `"0.85"` that serde's `serialize_f32` produces. The fix is in commit 2; the note in this commit makes the gotcha findable for the next person. Refs #689 · refs #696 (R1 precedent)

…ons guard + alphabetical-naming constraint cross-reference Two LOW/MEDIUM nits from the 2-reviewer pass on R2 (#689 follow-up, sort-consolidation companion to #696). 1. **rust-specialist nit** — `expect("corrections present")` at `build_cache_key` line 1100 relied on a separately-computed `corrections_present` boolean gate to prove unwrap-safety. Replaced with `if let Some(corrections) = cfg.corrections.as_ref().filter(|c| !c.is_empty())` — same semantic (only emit the `corrections` key when the map is present AND non-empty, matching the retired `#[serde(skip_serializing_if = "Option::is_none")]` shape on `WasmConfigCacheKey`), but the Some-and-non-empty proof is now structurally visible at the binding site rather than relying on a doc-comment to bridge the gap. 2. **general code-reviewer nit** — the `WasmConfig` struct doc-comment described the R2 retirement motive but did not surface the alphabetical-field-naming constraint that the `build_cache_key` doc-comment captures. A developer reading the struct doc first when adding a new field would not see the constraint until they ran the byte-identity tests (which catch the breakage at authorship, but the upfront warning is cheaper). Added a paragraph documenting the constraint with a cross- reference to `build_cache_key`. Declined the LOW field-name-in-error-test pin: the assertion boundary the existing 8 type-mismatch tests pin is loud-vs-silent, which IS the API contract; field-name presence in the error string is implementation-quality, not contract. Pinning it would lock the present error wording (which we explicitly accepted dropping serde's column-number context for) — keeping the boundary contractual rather than wording-coupled. General reviewer flagged this as a judgment call. cargo +stable clippy --workspace --all-targets --all-features -- -D warnings: green cargo test -p marque-wasm: 113 tests green (including all 19 R2 unit tests)

Copilot

Pull request overview

This PR continues the WASM bundle-size reduction work (#689 follow-up) by removing serde derive-based config parsing/serialization from marque-wasm’s parse_wasm_config path, replacing it with manual serde_json::Value extraction and direct serde_json::Map cache-key construction while preserving cache-key byte identity via new regression tests.

Changes:

Replace #[derive(Deserialize)] WasmConfig parsing with explicit Value-based field extraction (wasm_config_from_value) and improved field-scoped type-mismatch errors.
Remove WasmConfigCacheKey + #[derive(Serialize)] and build the cache key via direct serde_json::Map construction (including an f32 formatting workaround to preserve byte identity).
Add extensive unit tests to pin cache-key byte identity, loud failure on type mismatches, and silent-ignore behavior for unknown fields.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
crates/wasm/src/lib.rs	Refactors WASM config parsing/cache-key construction away from serde derives; adds helpers + regression tests to preserve cache semantics and behavior.
claudedocs/wasm-bundle-analysis-2026-05-22.md	Updates investigation notes with measured R2 results and details of the landed approach.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+/// explicit `obj.get(...)` form additionally tightens the
+/// runtime-config surface relative to the derive (the closed
+/// accept-list is enumerated at the call site, not implied by the
+/// struct shape).


+        let mut corrections_map = serde_json::Map::new();
+        for (k, v) in corrections {
+            corrections_map.insert(k.clone(), serde_json::Value::String(v.clone()));
+        }
+        map.insert(
+            "corrections".to_owned(),
+            serde_json::Value::Object(corrections_map),
+        );


+        // `#[derive(Serialize)] WasmConfigCacheKey` path (which used
+        // `serialize_f32` directly). The format-then-parse roundtrip
+        // here is cheap (one short alloc, one parse of a known-valid
+        // numeric literal) and runs only on engine-cache-miss.


empirically rejected) Copilot's PR #697 review surfaced three inline comments: **#1 (lib.rs:842) — accepted**: doc-comment claimed R2 "tightens the runtime-config surface relative to the derive". Copilot correctly pointed out that the pre-R2 `#[derive(Deserialize)]` on `Option<String>` / `Option<f32>` / `Option<HashMap<String,String>>` / `Option<f64>` was ALREADY type-strict and ALREADY had a closed accepted-field set. There is no concrete pre-R2 case that was accepted and is now rejected. The synthesis brief AND the rust- specialist review both misclaimed "tightening"; Copilot caught it. Reworded the `WasmConfig` doc-comment to state the post-R2 surface matches the pre-R2 Derive exactly — same four field names, same `Option<T>` shape, same loud-failure-on-mismatch, same silent-ignore. The R2 change is structural (binary-size cut + improved error messages with field names), NOT a tightening. **#3 (lib.rs:1104) — accepted**: f32 format-parse roundtrip comment claimed it "runs only on engine-cache-miss" but `build_cache_key` runs on every `parse_wasm_config` call (which is every lint_native / fix_native call), regardless of whether the resulting cache key hits or misses the engine cache. Corrected the comment. **#2 (lib.rs:1124) — empirically REJECTED**: Copilot suggested replacing the `serde_json::Map`-with-cloned-keys path with a borrow-preserving `Vec<(&str,&str)>` sort + direct JSON string construction, to eliminate per-call `corrections` key/value clones. The suggestion is theoretically correct on runtime: O(n) String clones become O(n) borrow-preserving sort comparisons. But empirically — measured locally with `wasm-pack build --target web --profile release-web` (CI-gate profile) — the borrow-preserving rewrite REGRESSED WASM bundle size from 1,364,312 B (original R2) to 1,370,786 B (with Copilot's suggested rewrite). That's a +6,474 B regression — would erase 83% of R2's −7,750 B binary win. The mechanism is the same LTO+ICF lesson captured in PR #696 (R1's `sorted_entries` revert at `crates/capco/src/lattice.rs:249-278`): the `serde_json::Map` / `Value::Object` / `to_string::<Value>` path REUSES serde_json machinery already monomorphized for the Diagnostic JSON output side of `marque-wasm`. The direct-String-construction path forces fresh monomorphizations of `serde_json::to_string::<str>` plus a longer push-based code path. Theoretical clone-count "wins" don't beat empirical binary-size measurements. The runtime cost Copilot flagged is real but small: typical corrections maps in marque usage are 5-20 entries; per-call cost is 10-40 short String allocations, in the noise relative to the ~400 µs lint hot path. The binary savings (6.5 KB) dwarf the runtime cost. Trade favors keeping the original implementation. Documented the trade in the `build_cache_key` doc-comment with a cross-reference to PR #696's empirical-revert precedent — future maintainers see WHY the seemingly-suboptimal clones are preserved. Also brought along: structural Some-and-non-empty proof via `.filter()` (rust-specialist nit from review fix-up b974550) stays in place; that one didn't depend on the body shape Copilot suggested. Empirical measurements (wasm-pack --profile release-web, local): - staging baseline: 1,372,062 B - original R2 (b974550, serde_json::Map): 1,364,312 B (−7,750) - Copilot #2 fix (direct String): 1,370,786 B (−1,276) ← rejected - this commit (revert + doc fixes): 1,364,386 B (−7,676) cargo +stable clippy --workspace --all-targets --all-features -- -D warnings: green cargo test -p marque-wasm --lib r2_: 19/19 green (byte-identity preserved)

bashandbone · 2026-05-22T19:15:11Z

Thanks @copilot for the three inline comments — all three were on-target. Disposition + commit b41947a9:

#1 (lib.rs:842) — ACCEPTED. You were right. The pre-R2 #[derive(Deserialize)] on Option<String> / Option<f32> / Option<HashMap<String,String>> / Option<f64> was already type-strict with a closed accepted-field set; there's no concrete pre-R2 case that was accepted and is now rejected. The "tightening" claim was a misreading that survived the synthesis brief AND the rust-specialist review without being caught. Reworded the WasmConfig doc-comment to state the post-R2 surface matches the pre-R2 Derive exactly — same field names, same Option<T> shape, same loud-failure-on-mismatch, same silent-ignore. The R2 change is structural (binary-size cut + improved error messages with field names), NOT a tightening.

#3 (lib.rs:1104) — ACCEPTED. Comment correction. build_cache_key runs on every parse_wasm_config call (which is every lint_native / fix_native call), not only on engine-cache miss. Fixed.

#2 (lib.rs:1124) — empirically REJECTED with documented rationale. Your suggestion is theoretically correct on runtime (eliminate O(n) String clones via borrow-preserving Vec<(&str,&str)> + direct JSON construction). I implemented it and measured on wasm-pack build --target web --profile release-web (CI-gate-comparable profile):

approach	WASM size	delta vs staging
staging baseline	1,372,062 B	0
original R2 (`serde_json::Map`-with-clones)	1,364,312 B	−7,750 B
your suggested rewrite (direct String construction)	1,370,786 B	−1,276 B
current commit (revert + doc fixes)	1,364,386 B	−7,676 B

The borrow-preserving rewrite REGRESSED the bundle by +6,474 B — would erase 83% of R2's binary win. The mechanism is the same LTO+ICF lesson captured in PR #696 (R1's sorted_entries revert): the serde_json::Map / Value::Object / to_string::<Value> path reuses serde_json machinery already monomorphized for the Diagnostic JSON output side of this crate; the direct-String approach forces fresh monomorphizations of serde_json::to_string::<str> plus a longer push-based code path.

The runtime cost you flagged is real but small in practice — typical corrections maps in marque usage are 5-20 entries; per-call cost is 10-40 short String allocations, in the noise relative to the ~400 µs lint_10kb hot path. The binary savings (6.5 KB) dwarf the runtime cost. The trade-off was documented in the build_cache_key doc-comment with a cross-reference to PR #696's empirical-revert precedent so future maintainers see WHY the seemingly-suboptimal clones are preserved.

cargo +stable clippy --workspace -- -D warnings: green. cargo test -p marque-wasm: 19/19 R2 byte-identity tests green.

…p) (#697) fixtures for parse_wasm_config (#689) fix(wasm): #689 R2 review fix-up — structurally self-proving corrections guard + alphabetical-naming constraint cross-reference fix(wasm): #689 R2 review fix-up — Copilot doc-comment corrections (+ #2 empirically rejected)

bashandbone added 3 commits May 22, 2026 14:41

Copilot AI review requested due to automatic review settings May 22, 2026 19:05

bashandbone requested a review from a team as a code owner May 22, 2026 19:05

bashandbone added the performance Throughput, latency, bounded work, caching, and SC-001/SC-005 perf gates label May 22, 2026

Copilot started reviewing on behalf of bashandbone May 22, 2026 19:05 View session

Copilot AI reviewed May 22, 2026

View reviewed changes

Copilot started work on behalf of bashandbone May 22, 2026 19:15 View session

Copilot finished work on behalf of bashandbone May 22, 2026 19:16

bashandbone added this to the EPIC Lattice Implementation milestone May 22, 2026

bashandbone linked an issue May 22, 2026 that may be closed by this pull request

perf(wasm): investigate +5.18% WASM binary regression vs. baseline (cumulative across staging window) #689

Closed

Merge branch 'staging' into refactor/r2-wasm-config-serde-reduction

b4c50e9

bashandbone merged commit 4ad5ee1 into staging May 22, 2026
13 checks passed

bashandbone deleted the refactor/r2-wasm-config-serde-reduction branch May 22, 2026 19:30

bashandbone mentioned this pull request May 22, 2026

perf(wasm): investigate +5.18% WASM binary regression vs. baseline (cumulative across staging window) #689

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor(wasm): R2 — parse_wasm_config serde reduction (#689 follow-up)#697

refactor(wasm): R2 — parse_wasm_config serde reduction (#689 follow-up)#697
bashandbone merged 6 commits into
stagingfrom
refactor/r2-wasm-config-serde-reduction

bashandbone commented May 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

bashandbone commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bashandbone commented May 22, 2026

Measurements

Summary

Constitution III check — runtime-config does not expand engine's semantic surface

f32 byte-identity gotcha

Behavior changes — errors from the Derive surface

Adjacent-bug survey

Constitution checks

Test plan

Review-pass changes (commit b974550b)

References

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

bashandbone commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Review-pass changes (commit `b974550b`)