@@ -83,7 +83,7 @@ To handle communication between our code on the CPU and GPU, we'll use
8383implements the WebGPU API. On the web, it works directly with the browser's WebGPU
8484implementation. On native platforms, it translates API calls to the platform's GPU API
8585(Vulkan, DirectX, or Metal). This lets us run the same code on a wide range of
86- platforms, including Windows, Linux, macOS, iOS[^1], Android, and the web[^2].
86+ platforms, including Windows, Linux, macOS[^1], iOS[^2], Android, and the web[^3].
8787
8888By using Rust GPU and ` wgpu ` , we have a clean, portable setup with everything written in
8989Rust.
@@ -147,9 +147,9 @@ There are a couple of things to note about the Rust implementation:
1471474 . The inner loop (` for i in 0..dimensions.k ` ) uses Rust's ` for ` syntax with a range.
148148 This is a higher-level abstraction compared to manually iterating with an index in
149149 other shader languages like WGSL, GLSL, or HLSL.
150- 5 . Read-only inputs are immutable references (` &Dimensions ` / ` &[f32] ` ) and writeable outputs are
151- mutable references (` &mut [f32] ` ). This feels very familiar to anyone used to writing
152- Rust.
150+ 5 . Read-only inputs are immutable references (` &Dimensions ` / ` &[f32] ` ) and writable
151+ outputs are mutable references (` &mut [f32] ` ). This feels very familiar to anyone
152+ used to writing Rust.
153153
154154#### What's with all the ` usize ` ?
155155
@@ -181,7 +181,7 @@ Each workgroup, since it's only one thread (`#[spirv(compute(threads(1)))]`), pr
181181one ` result[i, j] ` .
182182
183183To calculate the full matrix, we need to launch as many entries as there are in the
184- matrix. Here we specify that (` Uvec3::new(m * n, 1, 1 ` ) on the CPU:
184+ `m * n` matrix. Here we specify that (`Uvec3::new(m * n, 1, 1)`) on the CPU:
185185
186186import { RustNaiveWorkgroupCount } from './snippets/naive.tsx';
187187
@@ -308,6 +308,14 @@ complete runnable code can be [found on
308308GitHub](https://github.com/Rust-GPU/rust-gpu.github.io/tree/main/blog/2024-11-21-optimizing-matrix-mul/code)
309309and you can run the benchmarks yourself with ` cargo bench ` .
310310
311+ ::: tip
312+
313+ You can also check out real-world projects using Rust GPU such as
314+ [`autograph`](https://github.com/charles-r-earp/autograph) and
315+ [`renderling`](https://renderling.xyz/).
316+
317+ :::
318+
311319## Reflections on porting to Rust GPU
312320
313321Porting to Rust GPU went quickly, as the kernels Zach used were fairly simple. Most of
@@ -320,9 +328,11 @@ is not _great_ as it is still blog post code!
320328
321329My background is not in GPU programming, but I do have Rust experience. I joined the
322330Rust GPU project because I tried to use standard GPU languages and knew there must be a
323- better way. Writing these GPU kernels felt like writing any other Rust code (other than
324- debugging, more on that later) which is a huge win to me. Not just the language itself,
325- but the entire development experience.
331+ better way.
332+
333+ Writing these GPU kernels felt like writing any other Rust code (other than debugging,
334+ more on that later) which is a huge win to me. Not just the language itself, but the
335+ entire development experience.
326336
327337## Rust-specific party tricks
328338
@@ -372,10 +382,10 @@ bug I couldn't figure out. GPU debugging tools are limited and `printf`-style de
372382often isn't available. But what if we could run the GPU kernel _on the CPU_, where we
373383have access to tools like standard debuggers and good ol' `printf`/`println`?
374384
375- With Rust GPU, this was straightforward. By using ` cfg() ` directives I made the
376- GPU-specific annotations (` #[spirv(...)] ` ) disappear when compiling for the CPU. The
377- result? The kernel became a regular Rust function. On the GPU, it behaves like a shader.
378- On the CPU, it's just a function you can call directly.
385+ With Rust GPU, this was straightforward. By using standard Rust ` cfg() ` directives I
386+ made the GPU-specific annotations (` #[spirv(...)] ` ) disappear when compiling for the
387+ CPU. The result? The kernel became a regular Rust function. On the GPU, it behaves like
388+ a shader. On the CPU, it's just a function you can call directly.
379389
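As a rough illustration of the idea in the paragraph above, here is a minimal sketch that uses `cfg_attr` to attach the `#[spirv(...)]` annotations only when compiling for the GPU. This is an assumption about how such gating can be written, not the post's actual code: the kernel name `double_kernel`, its parameters, and the tuple used for the invocation ID are all hypothetical, and a real Rust GPU kernel would most likely use a vector type such as `UVec3` instead of a tuple.

```rust
// Hypothetical kernel for illustration only; not taken from the post.
// On a GPU build you would also need the `spirv` attribute macro in scope
// (for example via `use spirv_std::spirv;`).
//
// When `target_arch` is not "spirv", every `cfg_attr` below expands to
// nothing, so this compiles as an ordinary Rust function on the CPU.
#[cfg_attr(target_arch = "spirv", spirv(compute(threads(1))))]
pub fn double_kernel(
    // Stand-in for the invocation ID. A (u32, u32, u32) tuple keeps this
    // sketch free of dependencies; a real kernel would use a vector type.
    #[cfg_attr(target_arch = "spirv", spirv(global_invocation_id))] id: (u32, u32, u32),
    #[cfg_attr(
        target_arch = "spirv",
        spirv(storage_buffer, descriptor_set = 0, binding = 0)
    )]
    input: &[f32],
    #[cfg_attr(
        target_arch = "spirv",
        spirv(storage_buffer, descriptor_set = 0, binding = 1)
    )]
    output: &mut [f32],
) {
    let i = id.0 as usize;
    // Guard against invocations that fall outside the buffers.
    if i < input.len() && i < output.len() {
        output[i] = input[i] * 2.0;
    }
}
```

On the CPU the function can then be called directly, one "invocation" at a time, for example `double_kernel((0, 0, 0), &input, &mut output)`, and stepped through with a regular debugger.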
380390Here's what it looks like in practice using the 2D tiling kernel from before:
381391
@@ -404,7 +414,7 @@ Testing the kernel in isolation is useful, but it does not reflect how the GPU e
404414it with multiple invocations across workgroups and dispatches. To test the kernel
405415end-to-end, I needed a test harness that simulated this behavior on the CPU.
406416
407- Building the harness was straightforward due to the borrow checker . By enforcing the
417+ Building the harness was straightforward due to Rust. By enforcing the
408418same invariants as the GPU I could validate the kernel under the same conditions the GPU
409419would run it:
410420
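To make the idea of a CPU-side harness concrete, here is a minimal sketch, separate from the harness used in the post. It assumes a naive kernel where each single-thread workgroup computes one output entry, and it simulates a `(m * n, 1, 1)` dispatch by running the invocations one after another. The names `matmul_entry`, `dispatch_on_cpu`, and `matmul_on_cpu` are hypothetical, and a sequential loop does not model the concurrency of a real GPU.

```rust
/// Hypothetical stand-in for a naive kernel: one invocation computes one
/// entry of the `m x n` result from row-major `a` (m x k) and `b` (k x n).
pub fn matmul_entry(
    index: usize,
    m: usize,
    n: usize,
    k: usize,
    a: &[f32],
    b: &[f32],
    result: &mut [f32],
) {
    // Ignore invocations beyond the end of the matrix.
    if index >= m * n {
        return;
    }
    let row = index / n;
    let col = index % n;
    let mut sum = 0.0;
    for i in 0..k {
        sum += a[row * k + i] * b[i * n + col];
    }
    result[row * n + col] = sum;
}

/// Simulates a `(workgroups, 1, 1)` dispatch of single-thread workgroups by
/// calling the invocation once per workgroup ID, in order.
pub fn dispatch_on_cpu(workgroups: usize, mut invocation: impl FnMut(usize)) {
    for id in 0..workgroups {
        invocation(id);
    }
}

/// Runs the whole "dispatch" on the CPU and returns the result matrix.
pub fn matmul_on_cpu(m: usize, n: usize, k: usize, a: &[f32], b: &[f32]) -> Vec<f32> {
    let mut result = vec![0.0; m * n];
    // The closure holds the only mutable borrow of `result` for the duration
    // of the dispatch, which the borrow checker verifies at compile time.
    dispatch_on_cpu(m * n, |id| matmul_entry(id, m, n, k, a, b, &mut result));
    result
}
```

A harness like this makes it possible to compare the kernel's output against a known-good result in an ordinary `cargo test`.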
@@ -450,7 +460,7 @@ other Rust project.
450460
451461This required no new tools or workflows. The tools I already knew worked seamlessly.
452462More importantly, this approach benefits anyone working on the project. Any Rust
453- engineer can run these benchmarks with no additional setup-- ` cargo bench ` is a standard
463+ engineer can run these benchmarks with no additional setup: `cargo bench` is a standard
454464part of the Rust ecosystem.
455465
456466### Lint
@@ -517,9 +527,9 @@ and `f64` without duplicating code, all while maintaining type safety and perfor
517527### Error handling with ` Result `
518528
519529Rust GPU also supports error handling using ` Result ` . Encoding errors in the type system
520- makes it clear where things can go wrong and forces developers to handle those cases.
521- This is particularly useful for validating kernel inputs or handling the many edge cases
522- in GPU logic.
530+ makes it clear where things can go wrong and forces you to handle those cases. This is
531+ particularly useful for validating kernel inputs or handling the many edge cases in GPU
532+ logic.
523533
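As a small illustration of the point about validating kernel inputs, here is a sketch of dimension checks expressed with `Result`. It is plain Rust exercised on the CPU; the `DimensionError` type and `validate_inputs` function are hypothetical names, and whether this exact shape compiles unchanged under Rust GPU is an assumption that has not been verified here.

```rust
/// Hypothetical error type describing why kernel inputs were rejected.
#[derive(Debug, PartialEq, Eq)]
pub enum DimensionError {
    /// At least one of `m`, `n`, or `k` is zero.
    ZeroDimension,
    /// A buffer does not have the length its dimensions imply.
    LengthMismatch { expected: usize, actual: usize },
}

/// Checks that the buffers for `a` (m x k) and `b` (k x n) have the lengths
/// implied by the dimensions before any work is dispatched.
pub fn validate_inputs(
    m: usize,
    n: usize,
    k: usize,
    a_len: usize,
    b_len: usize,
) -> Result<(), DimensionError> {
    if m == 0 || n == 0 || k == 0 {
        return Err(DimensionError::ZeroDimension);
    }
    if a_len != m * k {
        return Err(DimensionError::LengthMismatch {
            expected: m * k,
            actual: a_len,
        });
    }
    if b_len != k * n {
        return Err(DimensionError::LengthMismatch {
            expected: k * n,
            actual: b_len,
        });
    }
    Ok(())
}
```

Because the function returns a `Result`, a caller that ignores the outcome gets a compiler warning, and using `?` propagates the failure without extra boilerplate.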
524534### Iterators
525535
@@ -535,12 +545,13 @@ future.
535545
536546### Conditional compilation
537547
538- This kernel doesn't use conditional compilation, but it's a key feature of Rust that
539- works with Rust GPU. With ` #[cfg(...)] ` , you can adapt kernels to different hardware or
540- configurations without duplicating code. GPU languages like WGSL or GLSL offer
541- preprocessor directives, but these tools lack standardization across projects. Rust GPU
542- leverages the existing Cargo ecosystem, so conditional compilation follows the same
543- standards all Rust developers already know.
548+ While I briefly touched on it a couple of times, this kernel doesn't really show the
549+ full power of conditional compilation. With `#[cfg(...)]` and [Cargo
550+ "features"](https://doc.rust-lang.org/cargo/reference/features.html), you can adapt
551+ kernels to different hardware or configurations without duplicating code. GPU languages
552+ like WGSL or GLSL offer preprocessor directives, but these tools lack standardization
553+ across projects. Rust GPU leverages the existing Cargo ecosystem, so conditional
554+ compilation follows the same standards all Rust developers already know.
544555
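A minimal sketch of what feature-based adaptation could look like is below. The `wide-tiles` feature name, the tile sizes, and the `tiles_needed` helper are all hypothetical; in a real crate the feature would be declared in the `[features]` table of `Cargo.toml` and enabled with `cargo build --features wide-tiles`.

```rust
// `wide-tiles` is a hypothetical Cargo feature used only for illustration.
// Exactly one of these two constants is compiled in.
#[cfg(feature = "wide-tiles")]
pub const TILE_SIZE: usize = 32;

#[cfg(not(feature = "wide-tiles"))]
pub const TILE_SIZE: usize = 16;

/// Number of tiles needed to cover `len` elements (rounded up).
/// The same source serves both configurations; only `TILE_SIZE` differs.
pub fn tiles_needed(len: usize) -> usize {
    (len + TILE_SIZE - 1) / TILE_SIZE
}
```

With the feature disabled, `TILE_SIZE` is 16, so `tiles_needed(33)` evaluates to 3. The rest of the kernel code does not need to know which configuration was selected.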
545556## Come join us!
546557
@@ -551,7 +562,8 @@ or get involved, check out the [`rust-gpu` repo on
551562GitHub](https://github.com/rust-gpu/rust-gpu).
552563<br />
553564
554- [^1]: Via [MoltenVK](https://github.com/KhronosGroup/MoltenVK)
555- [^2]:
556- Technically `wgpu` translates SPIR-V to GLSL or WGSL via
557- [naga](https://github.com/gfx-rs/wgpu/tree/trunk/naga)
565+ [^1]: Technically `wgpu` uses [MoltenVK](https://github.com/KhronosGroup/MoltenVK) or translates to Metal on macOS
566+ [^2]: Technically `wgpu` uses [MoltenVK](https://github.com/KhronosGroup/MoltenVK) or translates to Metal on iOS
567+ [^3]:
568+ Technically `wgpu` translates SPIR-V to GLSL (WebGL) or WGSL (WebGPU) via
569+ [naga](https://github.com/gfx-rs/wgpu/tree/trunk/naga) on the web