Fix Area Lights PBR crash: split sampler2D function parameters#1718
Merged
bkaradzic-microsoft merged 2 commits intoMay 28, 2026
Merged
Conversation
Adds a new SamplerFunctionParameterSplitter traverser that complements
the existing SplitSamplersIntoSamplersAndTextures pass. SplitSamplers
rewrites global sampler2D uniforms into separate texture2D + sampler
declarations, but it does not rewrite functions that take sampler2D
*by value*. Any user-defined helper that took a sampler2D parameter
(common in Babylon.js shaders, notably the LTC code path used by
PBRMaterial area lights) would therefore be left with a parameter
type the rest of the pipeline can no longer satisfy.
The new traverser does the two missing things:
1. Rewrites function definitions: every sampler2D parameter is
replaced with a pair (texture2D t, sampler s), and uses of the
original parameter inside the body are rewritten to
sampler2D(t, s) constructor calls.
2. Rewrites every call site: at each EOpFunctionCall that resolves
to a rewritten function, every sampler2D argument is replaced
by the (texture, sampler) pair sourced from the global rewrite
ids produced by SplitSamplersIntoSamplersAndTextures.
The subtle bit -- and the reason this had to be a dedicated traverser
rather than an extension of SplitSamplers -- is that glslang's call
nodes carry a parallel qualifier list alongside the argument list,
and GlslangToSpv reads qualifiers[i] for arg i. If the qualifier list
is not extended in lockstep with the inserted texture/sampler args,
the SPIR-V emitter walks off the end of the qualifier list, the
trailing arg never gets the OpStore that materializes its value, and
the eventual access chain into the uninitialized texture variable
hits a "node is rvalue, expected lvalue" assert deep in SpvBuilder.
Wired into all four backends that consume SplitSamplers (DXBC, DXIL,
Metal, Vulkan), with the new pass running immediately after
SplitSamplersIntoSamplersAndTextures so the global rewrite ids are
available.
Also bumps the SPIRV-Cross pin to pick up BabylonJS/SPIRV-Cross#12,
which exposes the HLSL image_type_hlsl / image_type_hlsl_legacy
implementations in WEBMIN builds. Without that companion change, the
same Area Lights test that previously crashed in glslang now reaches
SPIRV-Cross HLSL emission and throws "Invalid call." from the WEBMIN
stub at spirv_hlsl.cpp:20234.
Net effect: Playground visual test idx 194 "Area Lights - PBR" now
renders and validates against the reference image; no regressions in
the sampled idx 0/10/50/100 sweep.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes a crash in the Babylon Native shader compiler when Babylon.js shaders contain user-defined functions that take combined sampler2D (etc.) parameters by value — notably the "Area Lights – PBR" playground (idx 194). The existing SplitSamplersIntoSamplersAndTextures pass rewrites global uniform samplers into (texture, sampler) pairs but leaves user-function signatures with the now-unsupported combined-sampler parameter type, which tripped a deep accessChain.isRValue() assert in glslang's SPIR-V emitter.
Changes:
- Adds a new
SamplerFunctionParameterSplittertraverser that rewrites eachsampler2Dfunction parameter into a(texture, sampler)pair, replaces uses inside the body withEOpConstructTextureSampler(t, s)constructions, and re-mangles the function name; then unpacks each matching call-site argument into two args while keeping glslang's parallel per-arg qualifier list in lockstep. - Wires the new pass in after
SplitSamplersIntoSamplersAndTexturesin all four backends (DXBC, DXIL, Metal, Vulkan). - Bumps the
SPIRV-CrossGit pin to pick up un-guarding ofimage_type_hlsl[_legacy]in WEBMIN builds, so HLSL emission no longer throws after glslang stops asserting.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| Plugins/ShaderCompiler/Source/ShaderCompilerTraversers.h | Declares SplitSamplerFunctionParameters with rationale doc comment. |
| Plugins/ShaderCompiler/Source/ShaderCompilerTraversers.cpp | Implements the new traverser (function-def rewrite + body-symbol rewrite + call-site rewrite with synchronized qualifier list). |
| Plugins/ShaderCompiler/Source/ShaderCompilerDXBC.cpp | Invokes the new pass after sampler splitting. |
| Plugins/ShaderCompiler/Source/ShaderCompilerDXIL.cpp | Invokes the new pass after sampler splitting. |
| Plugins/ShaderCompiler/Source/ShaderCompilerMetal.cpp | Invokes the new pass after sampler splitting. |
| Plugins/ShaderCompiler/Source/ShaderCompilerVulkan.cpp | Invokes the new pass after sampler splitting. |
| CMakeLists.txt | Bumps SPIRV-Cross pin to the WEBMIN-unguarded HLSL image-type build. |
Per Copilot review on BabylonJS#1718: the previous conditional if (idx < qualifiers.size()) qualifiers.insert(...); silently fell back to the buggy path -- if the qualifier list was ever shorter than args, the qualifier insert was skipped, and we re-introduced exactly the OOB-read -> elided OpStore -> uninitialized texture access -> accessChain.isRValue() assert chain this pass was created to fix. glslang's GlslangToSpv::handleUserFunctionCall indexes qualifiers[i] for every arg i with no bounds check, so the invariant qualifiers.size() == args.size() must hold by construction for any EOpFunctionCall reaching the SPIR-V emitter. Check that invariant up front and throw on violation, matching the other defensive throws in this rewriter. The per-iteration qualifier insert is now unconditional. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CedricGuillemet
approved these changes
May 28, 2026
bkaradzic-microsoft
added a commit
that referenced
this pull request
May 28, 2026
…hts, OpenPBR Analytic Lights) (#1721) ## Summary Babylon.js v6+ shaders for Area Lights (`computeAreaLighting`) and a number of other PBR/OpenPBR helpers declare a struct local without an initializer, then immediately do read-modify-write on its fields: ```glsl lightingInfo result; result.specular += value; result.diffuse += value; return result; ``` This is undefined behavior under the GLSL spec (locals are not zero-initialized), but every WebGL driver in practice tolerates it. SPIRV-Cross faithfully reproduces the pattern in its HLSL output, and `D3DCompile` correctly rejects it with `error X4000: variable used without having been completely initialized`. Result: `Area Lights - Standard Material` and any other test that exercises this code path fails to compile its fragment shader on D3D11 / D3D12 today. ## Fix Add a new AST traverser, `ZeroInitializeStructLocals`, that walks every function body and prepends `EOpAssign(symbol, EOpConstructStruct(0, 0, ...))` for each unique struct-typed `EvqTemporary` symbol it references. The SPIR-V emitter then sees a normal store; SPIRV-Cross emits HLSL with the struct local definitely initialized; `D3DCompile` accepts it. Any subsequent real assignment in the body is dead-code eliminated by DXC's optimizer, so functional behavior is unchanged. Wired into the DXBC, DXIL, Metal, and Vulkan compile pipelines (same as the existing sampler-handling and dFdy traversers). Arrays of structs are intentionally skipped — they are uncommon in BJS shaders and would require constructing a synthetic array constructor. ## Verification End-to-end on `build/win32-default` (D3D11, Chakra, RelWithDebInfo) and `build/win32-gl` (OpenGL/ANGLE, V8, Release): | Test | D3D11 | OpenGL | |---|---|---| | idx 191 Area Lights - Standard Material | PASS (1027 px diff) | PASS (1131 px diff) | | idx 194 Area Lights - PBR (regression check for #1718) | PASS | PASS | | idx 558 OpenPBR Fuzz Weight vs Fuzz Roughness - Analytic Lights | PASS | PASS | | idx 559 OpenPBR Fuzz Weight vs Coat Weight w/ Normal Maps - Analytic Lights | PASS | PASS | | idx 574 OpenPBR Transmission Dispersion VS IOR - Analytic Lights | PASS | PASS | | idx 0/39/47/165 (baseline regression checks) | PASS | PASS | Wide regression sweep: - D3D11: 135 indices (1-30, 100-115, 160-180, 285-305, 380-405, 540-560) → 134 PASS / 1 FAIL. The single failure is idx 22 `Outline` with `Error: Unsupported alpha mode: -1` — pre-existing, unrelated to shader pipeline. - OpenGL: 113 indices (1-50, 100-120, 250-270, 400-420) → 52 PASS / 60 SKIP / 1 FAIL. Same idx 22 failure. No new regressions on either backend. ## Why a BN traverser (not a SPIRV-Cross fix) Both the SPIRV-Cross fork-extension route and the BN traverser route would solve this. We chose the traverser because: 1. It avoids extending the BabylonJS/SPIRV-Cross fork further. 2. The fix lives next to the existing sampler/dFdy traversers, which use the exact same pattern. 3. It applies uniformly to all four backends (DXBC, DXIL, Metal, Vulkan) with one change. ## Out of scope - idx 192/193 (`Textured - Area Lights - *`) still fail with `Unsupported texture format or type: format 5, type 3553`. This is a `NativeEngine::PrepareImage` whitelist gap and orthogonal to this PR. - A separate cluster of OpenPBR-IBL tests (idx 557/561/575) fails with a JS-side `TypeError: 'FLOAT' undefined` before any shader compile. Will be filed as a separate issue. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft
added a commit
that referenced
this pull request
Jun 5, 2026
…ctim tests (#1715) Rebased onto latest `master`. The original "combined CI smoke" branch carried the bgfx MSAA+mipmaps fix, the bgfx.cmake bumps, the bx-API/bimg `imageParse` adaptations, and a large cascade-victim test re-enable/re-exclude experiment. **All of that is now superseded by `master`:** - `master`'s bgfx.cmake pin (`8711456e`) is a descendant of this branch's old pin (`5a00e8df`), so the MSAA+mipmaps fix and the bx/bimg API updates are already in. - `master` already passes `bx::ErrorIgnore` to `bimg::imageParse` (NativeEngine.cpp), already fixed Area Lights / OpenPBR (#1718/#1721), and already re-enabled the relevant Playground tests (#1716/#1719/#1698). What remains genuinely unique is the **Canvas framebuffer-pool survivability** change, which `master` still lacks (it has the bare `assert(handle.idx != bgfx::kInvalidHandle)` at `Canvas.cpp:179` and no `BGFX_CONFIG_MAX_FRAME_BUFFERS` propagation). ### This PR (single commit) The Canvas polyfill creates a fresh bgfx framebuffer per JS `Canvas` instance and per text-rendering op. Under V8 GC pacing, finalizers run in batches, so a long Playground sweep can outpace the default 128-slot bgfx framebuffer pool. The first allocation past the cliff returned `BGFX_INVALID_HANDLE`; the bare `assert` tripped in Debug, or the invalid handle bled into bgfx-internal arrays in Release (later AV at `bgfx_p.h`). The whole run aborted regardless of which test was unlucky. - `Dependencies/CMakeLists.txt`: propagate `BGFX_CONFIG_MAX_FRAME_BUFFERS` to the bgfx target (default 512 = 4x bgfx default; honors the `-D` override CI already passes — previously a no-op). - `Canvas.cpp` `UpdateRenderTarget()`: on `kInvalidHandle`, free the two just-allocated textures and `throw std::runtime_error`; `Context::Flush` rethrows as `Napi::Error` so the offending test fails cleanly without taking the rest of the sweep down. - `FrameBufferPool::Add`: same treatment for symmetry. The pool bump alone is expected to remove the cascade in practice; the throw paths are defense-in-depth so future regressions surface as a catchable JS error instead of a silent process kill. _Dropped from the previous revision (now redundant): bgfx.cmake bumps, bx-API/SIMD adaptation, nanovg `bx/math.h` include, bimg `imageParse` ErrorIgnore source changes, and the cascade-victim `config.json` re-enable/re-exclude churn + `validation_native.js` continue-past-failures scaffolding. The continue-past-failures runner improvement can be revived as its own focused PR if still wanted._ --- ### Plus: 25 crash-cascade victim test re-enables The framebuffer-pool abort above was the direct cause of a large family of Playground tests being excluded as Win32 D3D11 `ACCESS_VIOLATION` / `STATUS_BREAKPOINT` "cascade victims" — they were never broken individually; the suite just crashed near them when the FB pool overflowed, and whichever test was running got blamed. With the abort fixed, this PR re-enables **25** of those tests. Each was verified to introduce **zero** downstream regressions in a full headless sweep on **both** the D3D11 and ANGLE/OpenGL backends (185 ran / 185 passed / 0 failed on each). **Deferred to a follow-up PR (not re-enabled here):** the remaining cascade-victim tests are blocked by a single distinct bug — `cube-with-holes-using-stencil-buffer` (`#CW5PRI#11`) is a render-state corruptor that leaves the stencil buffer dirty and produces a ~38-test downstream pixel cascade on both backends (it is the same corruptor behind the previously-observed `sphere-…-glow-layer` D3D11 failure). That needs its own fix before its neighbours can be safely re-enabled. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft
added a commit
to bkaradzic-microsoft/BabylonNative
that referenced
this pull request
Jun 5, 2026
Follow-up to BabylonJS#1715. Now that the Canvas framebuffer-pool fix (BabylonJS#1715) and the Area Lights / OpenPBR shader fixes (BabylonJS#1718 sampler2D split, BabylonJS#1721 zero-init struct locals) have landed, a further 52 previously-excluded Playground tests pass cleanly. Each was verified with a full headless sweep on BOTH Win32 D3D11 and ANGLE OpenGL locally (0 failures, 237 ran / 237 passed). - 23 Area Lights / OpenPBR Analytic Lights tests fixed by BabylonJS#1718 + BabylonJS#1721. - 29 framebuffer-pool cascade victims fixed by BabylonJS#1715. Still excluded (deferred): 6 OpenPBR pixel-diffs, 2 OpenGL-only failures (UV2 Morphing, Vertex Pulling Morph Targets and Bones), the Linux pixel-diff set, and two state corruptors that need their own fixes: cube-with-holes-using-stencil-buffer (dirty stencil) and GLTF Node visibility test (dirties OpenGL state, breaks downstream MeshDebugPluginMaterial). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft
added a commit
that referenced
this pull request
Jun 5, 2026
## Summary Follow-up to #1715. Re-enables **52 more** previously-excluded Playground tests that now pass after recent fixes landed on `master`: - **23** Area Lights / OpenPBR Analytic Lights tests — fixed by #1718 (sampler2D function-parameter split) and #1721 (zero-init struct-typed locals). - **29** Canvas framebuffer-pool cascade victims — fixed by #1715 (FB-pool survivability). ## Verification Each test was verified with a full headless sweep on **both** Win32 D3D11 and ANGLE OpenGL locally: ``` Run complete. ran=237 passed=237 failed=0 missingRef=0 skipped=483 ``` 0 failures on either backend, and no regressions on the previously-active set (the sweep also confirms no downstream cascade onto already-enabled tests). ## Deferred (still excluded) | Group | Count | Why | |---|---|---| | OpenPBR pixel-diffs | 6 | render correctly now but differ from reference; need rebake/triage | | OpenGL-only failures | 2 | UV2 Morphing, Vertex Pulling Morph Targets and Bones | | Linux pixel-diffs | ~22 | real pixel comparison failures, separate triage | | State corruptors | 2 | `cube-with-holes-using-stencil-buffer` (dirty stencil) and `GLTF Node visibility test` (dirties OpenGL state -> breaks downstream `MeshDebugPluginMaterial`). Each needs its own fix before re-enabling. | The two corruptors were pinned down by bisection — re-enabled alone, each produces a downstream cascade — so they are intentionally held back. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a long-standing crash on shaders that use
sampler2Das a function parameter — most visibly the Babylon.js "Area Lights - PBR" Playground visual test (idx 194), which usessampler2Dparameters in the LTC code path of PBRMaterial. Before this PR the crash was a deepaccessChain.isRValue()assert in glslang'sSpvBuilder; with debug builds disabled, the assert fired throughBX_ASSERTand exited with code 3.Two things had to be fixed to make idx 194 render end-to-end on the default Windows/D3D11 build:
1. New traverser:
SamplerFunctionParameterSplitterThe existing
SplitSamplersIntoSamplersAndTexturespass rewrites globalsampler2Duniforms into a(texture2D, sampler)pair. It does not rewrite user-defined functions that takesampler2Dby value. Such helpers — common in Babylon.js — were therefore left with a parameter type the rest of the pipeline can no longer satisfy.SamplerFunctionParameterSplitterdoes the missing two halves:sampler2Dparameter is replaced with a(texture2D t, sampler s)pair, and uses of the original parameter inside the body are rewritten tosampler2D(t, s)constructor calls.EOpFunctionCallthat resolves to a rewritten function, everysampler2Dargument is replaced by the(texture, sampler)pair, sourced from the global rewrite ids produced bySplitSamplers….The subtle bit — and the reason this had to be a dedicated traverser rather than an extension of
SplitSamplers…— is that glslang's call nodes carry a parallel qualifier list alongside the argument list.GlslangToSpv.cpp::handleUserFunctionCallreadsqualifiers[i]for argi. If the qualifier list isn't extended in lockstep with the inserted texture/sampler args, the SPIR-V emitter walks off the end of the qualifier list, the trailing arg never gets theOpStorethat materializes its value, and the access chain into the uninitialized texture variable trips theaccessChain.isRValue()assert deep inSpvBuilder.Concretely:
Wired into all four backends that consume
SplitSamplers…(DXBC, DXIL, Metal, Vulkan), with the new pass running immediately after.2. SPIRV-Cross pin bump (BabylonJS/SPIRV-Cross#12)
Once glslang stopped asserting, the same Area Lights test reached SPIRV-Cross HLSL emission and threw
Invalid call.from the WEBMIN stub atspirv_hlsl.cpp:20234(CompilerHLSL::image_type_hlsl). BabylonJS/SPIRV-Cross#12 un-guardsimage_type_hlslandimage_type_hlsl_legacyin WEBMIN builds (same un-guarding pattern as the earlier #11). This PR bumps theSPIRV-Crosspin inCMakeLists.txtto pick up that change.Testing
RelWithDebInfoagainst the new SPIRV-Cross pin.Test 'Area Lights - PBR' validated, exit 0).--include-excluded) at idx 0/10/50: allvalidated. Idx 100/200/300/400/500 are pre-existing exclusions inconfig.jsonand remainskipped=1— no behavior change.Files
Plugins/ShaderCompiler/Source/ShaderCompilerTraversers.hPlugins/ShaderCompiler/Source/ShaderCompilerTraversers.cppPlugins/ShaderCompiler/Source/ShaderCompiler{DXBC,DXIL,Metal,Vulkan}.cppCMakeLists.txtGIT_TAGbump