re_renderer: cross-platform shader imports & build-system #226
Wumpf
left a comment
amazing work, love it!
The only thing that tripped me up was the env var query in the release version of the file macro. That seems wrong to me, and I believe that corner could be simplified with a more pragmatic view on path normalization (aka never canonicalize in memfs world, and canonicalize/normalize late), but I might be missing details.
I think a lot of the confusion comes from the fact that this is a compile-time |
// The path returned by the `file!()` macro is always hermetic, and we pre-load
// our run-time virtual filesystem using the exact same hermetic prefix.
//
// This is mandatory to get right, otherwise we couldn't compare paths at
// runtime and everything would crumble!
let path = if cfg!(not(target_arch = "wasm32")) {
    // Canonicalize the path in non-wasm release builds, as we do have an actual
    // filesystem to rely on.
    use anyhow::Context as _;
    ::std::fs::canonicalize(&path)
        .with_context(|| format!("failed to canonicalize path at {path:?}"))
        .unwrap()
} else {
    // Best we can do on wasm is lexicographically normalize the path.
    clean_path::clean(&path)
};

if cfg!(not(target_arch = "wasm32")) {
    // On native, we want to make sure to strip the local workspace prefix from
    // our paths, otherwise they won't make any sense at run-time, since the
    // shader paths embedded into the binary are hermetic.
    let strip_prefix = ::std::path::Path::new(env!("CARGO_WORKSPACE_DIR"))
        .parent()
        .unwrap();
    path.strip_prefix(strip_prefix).unwrap().to_owned()
} else {
    // On wasm, the build system already takes care of hermeticism for us: all
    // paths have the local workspace prefix pre-stripped.
    // They even go a bit too far in fact: they remove the root folder from
    // the path too. We need to bring that back.
    ::std::path::Path::new("rerun").join(path)
}
// Therefore, the in-memory filesystem will actually be able to find this path,
// and canonicalize it.
$crate::get_filesystem().canonicalize(&path).unwrap()
## Summary

Lockfile-only bumps plus a couple of pnpm `overrides` to close as many open Dependabot security alerts as possible.

### Addressed (≈30 alerts)

**pip / uv:**
- `urllib3` → 2.7.0 (rerun, dataplatform: #260, #261, #252, #255)
- `gitpython` → 3.1.50 (rerun_export, dataloader: #248, #249, #228, #229, #231, #232)
- `pygments` → 2.20.0 (rerun, dataplatform: #139, #140)
- `pynacl` → 1.6.2 (#24)
- `marshmallow` → 3.26.2 (#23)
- `filelock` → 3.29.0 (#21, #32)
- `virtualenv` → 21.3.1 (#31)
- `uv` → 0.11.13 (#170)
- `torch` → 2.11.0 in rerun/uv.lock (#43); examples couldn't bump (lerobot/diffusers chain)
- `flask` → 3.1.3 (#66)

**npm:**
- `fast-uri` → 3.1.2 (#240, #241)
- `postcss` → 8.5.14 (#199)
- `cookie` → 0.7.2 via `pnpm.overrides` in docs and landing (#87, #105); SvelteKit still pins 0.6 transitively

**cargo:**
- `rand` → 0.8.6 in rerun and dataplatform (#192, #193)

### Skipped

- **diffusers 0.38** (#226, #230, #233, #238): pulls a `safetensors` pre-release, see #1949
- **transformers 5.0.0rc3** (#163): RC, intentionally not pulled
- **lru 0.16** (#19, #20): pinned by `tantivy 0.24` (transitive via `lance`)
- **thrift** (#239): no patched version available
- **torch in examples** (#195, #196): constrained by lerobot/diffusers chain
- **pytest** (#208): `rerun_py/pyproject.toml` already pins `pytest==9.0.3`; alert is stale

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Source-Ref: ce3edfea8c4ea60c4c13bfef2ae235cd477384f3
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… uv)

## Summary

- **idna** 3.11/3.12 → 3.17: fixes bypass of CVE-2024-3651 fix in `idna.encode()`
- **pymdown-extensions** 10.19 → 10.21.3: fixes path traversal bypass in `snippets`
- **pygments** 2.19.2 → 2.20.0: fixes ReDoS via inefficient GUID regex; removes the `<2.20` pin since pymdown-extensions 10.21.2 fixed the `None` filename compat issue
- **uv** 0.11.13 → 0.11.17: fixes arbitrary file write via entry point names

Closes Dependabot alerts: #140, #280, #281, #282, #283, #284, #290

### Already dismissed (can't fix in this PR)

- **#239** thrift/rust: no upstream fix available
- **#195, #196, #230, #233, #286, #287**: diffusers/torch in examples, blocked by lerobot pinning `diffusers<0.36.0` and `torch<2.8.0`

### Still open

- **diffusers 0.38.0** (#226, #238, #288, #289): requires `safetensors>=0.8.0-rc.0`; no stable safetensors 0.8.0 exists yet
- **transformers** (#163): fix only in 5.0.0rc3

## Test plan

- [x] Python docs build still works (`uv run --group docs mkdocs build`)
- [x] Lock files resolve cleanly (verified via `uv lock`)

Source-Ref: 6b5a6aba9e09f25efcecaf9e1f03729ffd5f54eb
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses open [Dependabot alerts](https://github.com/rerun-io/reality/security/dependabot) across the Python `uv.lock` files.

## Fixed

| Package     | Bump                        | Severity | Alerts                 |
|-------------|-----------------------------|----------|------------------------|
| `pyarrow`   | → 23.0.1 (dp) / 24.0.0 (rr) | high     | #297, #298             |
| `diffusers` | → 0.38.0                    | high     | #226, #238, #288, #289 |
| `aiohttp`   | → 3.14.1                    | medium   | #291–#296              |
| `torch`     | → 2.12.0 (dataplatform)     | low      | #305                   |

## pyproject changes

- **dataplatform**: `pyarrow` pin `19.0.1` → `23.0.1`
- **controlnet**: lifted the `diffusers<0.38` cap (the safetensors pre-release it was avoiding is now a stable release, `safetensors 0.8.0`)

The big `dataplatform/uv.lock` churn is the `torch` 2.9 → 2.12 bump swapping the NVIDIA `cu12` wheels for `cu13`.

## Not addressed (no safe fix available)

- **torch `torch.jit.script`** (#299, #304, #306, #307): no patched version exists; vulnerable through `<= 2.12.0`.
- **torch in examples** (#300–#303): `lerobot` caps torch to `<2.8.0`; the fix needs `>= 2.10.0`.
- **transformers `Trainer` RCE** (#163): only fix is the `5.0.0rc3` pre-release.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Source-Ref: 16f255919290c6df8faf38fe0f4efb97755c29d0
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Alright, I'll just quote the module docs directly.
Main module
This module implements one half of our cross-platform #import system.
The other half is provided as an extension to the build system, see the
`build.rs` file at the root of this crate.
While it is agnostic to the type of files being imported, in practice this is only used
for shaders, thus this is what this documentation will focus on.
In particular, integration with our hot-reloading capabilities can get tricky depending
on the platform/target.
Usage
`#import <x/y/z/my_file.wgsl>`
Syntax
Import clauses follow the general form of `#import <x/y/z/my_file.wgsl>`.
The path to be imported can be either absolute or relative to the path of the importer,
or relative to any of the paths set in the search path (`RERUN_SHADER_PATH`).
The actual parsing rules themselves are very barebones:
- An import clause must start with `#import` (excl. whitespaces).
- Everything between the first `<` and the last `>` is interpreted as the import path, as-is. We do so because, between the 4 major platforms (Linux, macOS, Windows, Web), basically any string is a valid path.
Everything is `trim()`ed at every step, you do not need to worry about whitespaces.
Resolution
Resolution is done in three steps:
1. First, we try to interpret the imported path as absolute.
   1.1. If this is possible and leads to an existing file, we're done.
   1.2. Otherwise, we go to 2.
2. Second, we try to interpret the imported path as relative to the importer's.
   2.1. If this leads to an existing file, we're done.
   2.2. Otherwise, we go to 3.
3. Finally, we try to interpret the imported path as relative to all the directories
   present in the search path, in their predefined priority order, similar to e.g. how
   the standard `$PATH` environment variable behaves.
   3.1. If this leads to an existing file, we're done.
   3.2. Otherwise, resolution failed: throw an error.
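The three resolution steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `resolve_import` and its parameters are hypothetical names, and a plain set of paths stands in for whatever filesystem (real or virtual) is actually queried.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical sketch of the three-step resolution order.
/// `existing` stands in for the filesystem: a path "exists" if the set contains it.
pub fn resolve_import(
    existing: &HashSet<PathBuf>,
    importer: &Path,
    search_path: &[PathBuf],
    import: &Path,
) -> Result<PathBuf, String> {
    // 1. Try to interpret the imported path as absolute.
    if import.is_absolute() && existing.contains(import) {
        return Ok(import.to_owned());
    }

    // 2. Try to interpret it as relative to the importer's directory.
    if let Some(dir) = importer.parent() {
        let candidate = dir.join(import);
        if existing.contains(&candidate) {
            return Ok(candidate);
        }
    }

    // 3. Try every directory of the search path, in priority order.
    for dir in search_path {
        let candidate = dir.join(import);
        if existing.contains(&candidate) {
            return Ok(candidate);
        }
    }

    // 3.2. Nothing matched: resolution failed.
    Err(format!("import path {import:?} could not be resolved"))
}
```

The sketch compares paths as-is; it assumes they were already normalized by the caller.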
Interpolation
Interpolation is done in the simplest way possible: the entire line containing the import
clause is overwritten with the contents of the imported file.
This is of course a recursive process.
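To make the parsing and interpolation rules concrete, here is a minimal sketch. It is not the crate's implementation: `parse_import` and `interpolate` are hypothetical names, files live in a plain map keyed by their import path (so no path resolution happens), and the cycle check is an addition of this sketch that the text above does not mention.

```rust
use std::collections::HashMap;

/// Returns the import path if `line` is an import clause (hypothetical helper).
pub fn parse_import(line: &str) -> Option<&str> {
    // Whitespace is trimmed at every step.
    let rest = line.trim().strip_prefix("#import")?;
    // Everything between the first `<` and the last `>` is the path, as-is.
    let start = rest.find('<')?;
    let end = rest.rfind('>')?;
    (start < end).then(|| rest[start + 1..end].trim())
}

/// Replaces every import line with the (recursively interpolated) imported file.
pub fn interpolate<'a>(files: &HashMap<&'a str, &'a str>, path: &'a str) -> Result<String, String> {
    interpolate_inner(files, path, &mut Vec::new())
}

fn interpolate_inner<'a>(
    files: &HashMap<&'a str, &'a str>,
    path: &'a str,
    stack: &mut Vec<&'a str>,
) -> Result<String, String> {
    // Assumption of this sketch: a file importing itself (even indirectly) is an error.
    if stack.contains(&path) {
        return Err(format!("import cycle detected at {path:?}"));
    }
    let contents: &'a str = files
        .get(path)
        .copied()
        .ok_or_else(|| format!("file not found: {path:?}"))?;

    stack.push(path);
    let mut out = Vec::new();
    for line in contents.lines() {
        match parse_import(line) {
            // The whole line is overwritten with the imported contents.
            Some(import) => out.push(interpolate_inner(files, import, stack)?),
            None => out.push(line.to_owned()),
        }
    }
    stack.pop();
    Ok(out.join("\n"))
}
```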
A word about `#pragma` semantics
Imports can behave in two different ways: `#pragma once` and `#pragma many`.
`#pragma once` means that each unique `#import` clause is only resolved once even if it is
used several times, e.g. assuming that `a.txt` contains the string `"xyz"`, a file that
imports `a.txt` twice ends up with a single copy of `xyz`.
`#pragma many` on the other hand will resolve the clause as many times as it is used:
the same file ends up with two copies of `xyz`.
Both have their use cases and drawbacks, there's no "right solution".
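The difference between the two semantics can be illustrated with a small, non-recursive sketch. Everything here is hypothetical (`Pragma`, `expand`, the in-memory `files` map), and it assumes that under "once" semantics a repeated import line is simply dropped.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Pragma {
    Once,
    Many,
}

/// Expands import lines in `source` using the given semantics (hypothetical sketch).
pub fn expand(source: &str, files: &HashMap<&str, &str>, pragma: Pragma) -> String {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for line in source.lines() {
        // Simplified parsing: exactly `#import <path>` once the line is trimmed.
        let import = line
            .trim()
            .strip_prefix("#import <")
            .and_then(|rest| rest.strip_suffix('>'));
        match import.and_then(|path| files.get(path).map(|contents| (path, *contents))) {
            Some((path, contents)) => {
                // `insert` returns false if this path was already imported.
                let first_time = seen.insert(path);
                if pragma == Pragma::Many || first_time {
                    out.push(contents);
                }
            }
            // Not an import (or unknown file): keep the line untouched.
            None => out.push(line),
        }
    }
    out.join("\n")
}
```

With `a.txt` containing `xyz`, a source made of two `#import <a.txt>` lines expands to one `xyz` line under `Pragma::Once` and to two under `Pragma::Many`.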
At the moment, our import system only provides support for `#pragma many` semantics.
We will most likely add support for `#pragma once` at some point.
Hot-reloading: platform specifics
This import system transparently integrates with the renderer's hot-reloading capabilities.
What that actually means in practice depends on the platform/target.
A general over-simplification of what we're aiming for can be expressed as:
When targeting native debug builds, we want everything to be as lazy as possible, everything
to happen just-in-time, e.g.:
`create_shader_module`.
On the web, we don't even have an actual filesystem to access at runtime, so not only would we
like to be as eager as can be, we don't have much of a choice to begin with.
That said, we don't want to be too eager either: while we do have to make sure that every
single shader that we're gonna use (whether directly or indirectly via an import) ends up
in the final artifact one way or another, we still want to delay interpolation as much as
we can, otherwise we'd be bloating the binary artifact with N copies of the exact same
shader code.
Still, we'd like to limit the number of differences between targets/platforms.
And indeed, the current implementation uses a virtual filesystem approach to effectively
remove any difference between how the different platforms behave at run-time.
Debug builds (excl. web)
Native debug builds are straightforward:
No surprises there.
Release builds (incl. web)
Things are very different for release artifacts, as 1) we disable hot-reloading there and
2) we never interact with the OS filesystem at run-time.
Still, in practice, we handle release builds just the same as debug ones.
What happens there is we have a virtual, hermetic, in-memory filesystem that gets pre-loaded
with all the shaders defined within the Cargo workspace.
This happens in part through a build script that you can find at the root of this crate.
From there, everything behaves exactly the same as usual. In fact, there is only one code
path for all platforms at run-time.
There are many issues to deal with along the way though: path comparisons across
environments and build-time/run-time, hermeticism, etc...
We won't cover those here: please refer to the code if you're curious.
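As a rough illustration of the "single code path" idea, the sketch below puts an OS-backed and an in-memory filesystem behind one trait. The names (`FileSystem`, `OsFileSystem`, `MemFileSystem`, `load_shader`) and the method set are assumptions made for this example, not a description of the crate's actual types.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical minimal abstraction shared by all platforms at run-time.
pub trait FileSystem {
    fn read_to_string(&self, path: &Path) -> io::Result<String>;
    fn exists(&self, path: &Path) -> bool;
}

/// Talks to the real filesystem, e.g. for native debug builds.
pub struct OsFileSystem;

impl FileSystem for OsFileSystem {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }

    fn exists(&self, path: &Path) -> bool {
        path.exists()
    }
}

/// In-memory filesystem, pre-loaded with shader sources for release builds.
#[derive(Default)]
pub struct MemFileSystem {
    files: HashMap<PathBuf, String>,
}

impl MemFileSystem {
    pub fn insert(&mut self, path: impl Into<PathBuf>, contents: impl Into<String>) {
        self.files.insert(path.into(), contents.into());
    }
}

impl FileSystem for MemFileSystem {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, format!("{path:?} not found")))
    }

    fn exists(&self, path: &Path) -> bool {
        self.files.contains_key(path)
    }
}

/// Callers only ever see the trait, whatever the target.
pub fn load_shader(fs: &dyn FileSystem, path: &Path) -> io::Result<String> {
    fs.read_to_string(path)
}
```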
For developers
Canonicalization vs. Normalization
Comparing paths can get tricky, especially when juggling target environments and
run-time vs. compile-time constraints.
For this reason you'll see plenty of mentions of canonicalization and normalization all over
the code: better make sure there's no confusion here.
Canonicalization (i.e. `std::fs::canonicalize`) relies on syscalls to both normalize a path
(including following symlinks!) and make sure the file it references actually exists.
It's the strictest form of path normalization you can get (and therefore ideal), but it
requires 1) access to an actual filesystem at run-time and 2) that the file
being referenced already exists.
Normalization (not available in `std`) on the other hand is purely lexical: it
normalizes paths as best as it can without ever touching the filesystem.
See also "Getting Dot-Dot Right".
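For illustration, purely lexical normalization can be sketched with `std::path::Component` alone. This is a minimal version written for this example (the hypothetical `normalize` below), not what `clean_path::clean` does internally; among other simplifications it returns an empty path where other implementations might return `.`.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalizes `path`: resolves `.` and `..` without touching the filesystem.
pub fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            // `.` never changes the location.
            Component::CurDir => {}
            Component::ParentDir => {
                let last = out.components().next_back();
                let can_pop = matches!(last, Some(Component::Normal(_)));
                let at_root = matches!(last, Some(Component::RootDir | Component::Prefix(_)));
                if can_pop {
                    // `a/b/..` becomes `a`.
                    out.pop();
                } else if !at_root {
                    // Leading `..` of a relative path has to be kept.
                    out.push("..");
                }
                // Otherwise we are at the root, and `/..` is just `/`.
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

Unlike `std::fs::canonicalize`, this never fails and never follows symlinks, so the result can differ from the canonical path whenever a symlink precedes a `..` component.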
Hermeticism
When shipping release artifacts (whether web or otherwise), we want to avoid leaking state
from the original build environments into the final binary (think: paths, timestamps, etc).
We need the build to be hermetic.
Rust's toolchain already takes care of that to some extent, and we need to match that
behaviour on our side (e.g. by not leaking local paths), otherwise we won't be able to
compare paths at runtime.
Think of it as `chroot`ing into our Cargo workspace :)
In our case, there's an extra invariant on top of that: we must never embed shaders from
outside the workspace into our release artifacts!
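A minimal sketch of what making a path hermetic could look like, assuming both inputs are already absolute and normalized. The function name `to_hermetic` is hypothetical; keeping the workspace folder name as the first component mirrors the prefix stripping shown in the diff earlier in this conversation.

```rust
use std::path::{Path, PathBuf};

/// Turns an absolute local path into a workspace-rooted (hermetic) one.
/// Returns `None` for anything outside the workspace, which must never be embedded.
pub fn to_hermetic(workspace_dir: &Path, path: &Path) -> Option<PathBuf> {
    // Invariant: nothing from outside the workspace may end up in the artifact.
    if !path.starts_with(workspace_dir) {
        return None;
    }
    // Strip everything up to (but excluding) the workspace folder itself.
    let prefix = workspace_dir.parent()?;
    path.strip_prefix(prefix).ok().map(Path::to_path_buf)
}
```

For example, with a workspace at `/home/user/dev/rerun`, a shader stored under `/home/user/dev/rerun/crates/...` maps to `rerun/crates/...`, and no local directory name leaks into the result.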
Things we don't support
`#import <myshader>` for `myshader.wgsl`.
Build extension
This build script implements the second half of our cross-platform shader #import system.
The first half can be found in `src/file_resolver.rs`.
It finds all WGSL shaders defined anywhere within our Cargo workspace, and embeds them
directly into the released artifact for our `re_renderer` library.
At run-time, for release builds only, those shaders will be available through a hermetic
virtual filesystem.
To the user, it will look like business as usual.
See `re_renderer/src/workspace_shaders.rs` for the end result.
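The general shape of such a build step might look like the sketch below: walk a directory tree, collect every `.wgsl` file, and generate Rust source that embeds each one. Both function names and the generated `fs.insert(...)` lines are hypothetical and only meant to convey the idea; the actual generated file may look quite different.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects all `.wgsl` files under `dir` (follows symlinked directories).
pub fn find_wgsl_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            find_wgsl_files(&path, out)?;
        } else if path.extension().map_or(false, |ext| ext == "wgsl") {
            out.push(path);
        }
    }
    Ok(())
}

/// Generates source lines that would pre-load a virtual filesystem.
/// Virtual paths are made relative to `workspace_dir` so no local prefix leaks.
pub fn generate_registry(workspace_dir: &Path, files: &[PathBuf]) -> String {
    let mut code = String::from("// Auto-generated sketch: embeds every workspace shader.\n");
    for file in files {
        let virtual_path = file.strip_prefix(workspace_dir).unwrap_or(file);
        code += &format!("fs.insert({:?}, include_str!({:?}));\n", virtual_path, file);
    }
    code
}
```

In a real build script the output would typically be written to a file that the library then includes, and the file list would be sorted to keep the generated source deterministic.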