Contributor

@jedisct1 jedisct1 commented Oct 15, 2025

Allows BLAKE3 to be computed using multiple threads.
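For readers unfamiliar with why this is possible: BLAKE3 is a tree hash, so independent chunks of the input can be hashed separately and combined afterwards. The sketch below illustrates that general shape only. It is not the code from this PR and it does not produce BLAKE3 digests: `hashlib.blake2s` is a stand-in compression step, and `CHUNK_SIZE`, the domain-separation bytes, and all function names are assumptions made for the illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Illustrative leaf size; an assumption for this sketch.
CHUNK_SIZE = 1024


def _leaf(chunk: bytes) -> bytes:
    # Stand-in leaf hash (NOT BLAKE3); the 0x00 prefix separates leaves from parents.
    return hashlib.blake2s(b"\x00" + chunk).digest()


def _parent(left: bytes, right: bytes) -> bytes:
    # Stand-in parent hash combining two child digests.
    return hashlib.blake2s(b"\x01" + left + right).digest()


def _reduce(nodes: list) -> bytes:
    # Pair up digests level by level until a single root remains.
    while len(nodes) > 1:
        paired = [_parent(nodes[i], nodes[i + 1]) for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2:
            paired.append(nodes[-1])  # an odd node is carried up unchanged
        nodes = paired
    return nodes[0]


def tree_hash(data: bytes, workers: int = 1) -> bytes:
    # Split the input into fixed-size chunks; empty input yields one empty chunk.
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]
    if workers <= 1:
        leaves = [_leaf(c) for c in chunks]
    else:
        # Leaves do not depend on each other, so they can be hashed concurrently.
        # pool.map returns results in input order, which keeps the tree deterministic.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            leaves = list(pool.map(_leaf, chunks))
    return _reduce(leaves)
```

The property that matters is that `tree_hash(data, workers=1)` and `tree_hash(data, workers=4)` return the same digest, because the tree shape depends only on the input length and not on how the work is scheduled. The sketch shows structure, not performance.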

Benchmarks (from https://github.com/jedisct1/zig-kangarootwelve):

Apple M1

==========================================================================================================
SUMMARY TABLE - All throughput values in MB/s
==========================================================================================================
Size         Chunks |      SHA256      BLAKE3  BLAKE3-Par  TurboSH128   KT128-Seq   KT128-Par
---------- -------- + ----------- ----------- ----------- ----------- ----------- -----------
64 B              1 |     1172.13      574.66       60.64      506.88      356.59      356.00
1 KB              1 |     2179.96      747.71      440.60     1262.64      598.69      598.90
8 KB              1 |     2294.30     1465.83     1241.32     1452.02     1337.35     1150.79
64 KB             8 |     2311.54     1508.80     1471.13     1464.32     1713.64     1665.33
1 MB            128 |     2309.67     1511.87     1504.40     1453.29     2572.95     2561.34
10 MB          1280 |     2307.63     1509.54     5229.25     1442.97     2626.67     9356.03
100 MB        12800 |     2310.07     1508.31     7632.71     1443.38     2643.04    12152.51
200 MB        25600 |     2311.98     1509.60     8419.46     1443.25     2601.17    13479.36
==========================================================================================================

AMD Zen4

==========================================================================================================
SUMMARY TABLE - All throughput values in MB/s
==========================================================================================================
Size         Chunks |      SHA256      BLAKE3  BLAKE3-Par  TurboSH128   KT128-Seq   KT128-Par
---------- -------- + ----------- ----------- ----------- ----------- ----------- -----------
64 B              1 |      878.24      523.53       97.42      395.50      293.91      295.34
1 KB              1 |     1486.90      720.74      521.66      931.96      477.53      478.87
8 KB              1 |     1553.39     3691.62     2924.73     1070.49      993.06      919.36
64 KB             8 |     1566.99     5020.52     4800.96     1075.00     1681.94     1656.39
1 MB            128 |     1565.47     5133.86     5113.38     1073.80     4219.76     4204.44
10 MB          1280 |     1561.68     5120.92     9344.03     1074.22     4627.68    11656.27
100 MB        12800 |     1563.46     3481.63    14390.99     1074.64     4560.84    24914.64
200 MB        25600 |     1563.00     3380.68    16670.07     1075.43     4557.86    26870.09
==========================================================================================================
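To make the tables easier to read, the parallel speedup can be computed as the BLAKE3-Par throughput divided by the BLAKE3 throughput. The snippet below is a small sketch that does this for a few rows copied from the two tables; the `RESULTS` layout and the function names are illustrative only.

```python
# (BLAKE3, BLAKE3-Par) throughput in MB/s, copied from the tables above.
RESULTS = {
    "Apple M1": {
        "64 B": (574.66, 60.64),
        "1 MB": (1511.87, 1504.40),
        "10 MB": (1509.54, 5229.25),
        "200 MB": (1509.60, 8419.46),
    },
    "AMD Zen4": {
        "64 B": (523.53, 97.42),
        "1 MB": (5133.86, 5113.38),
        "10 MB": (5120.92, 9344.03),
        "200 MB": (3380.68, 16670.07),
    },
}


def speedup(sequential: float, parallel: float) -> float:
    # Ratio above 1.0 means the parallel path is faster.
    return parallel / sequential


def report() -> list:
    # One formatted line per (machine, input size) pair.
    lines = []
    for machine, rows in RESULTS.items():
        for size, (seq, par) in rows.items():
            lines.append(f"{machine:9} {size:>7}: {speedup(seq, par):.2f}x")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On these numbers the parallel variant is far slower for 64 B inputs, roughly on par at 1 MB, and about 5x faster at 200 MB on both machines.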

Member

@andrewrk andrewrk left a comment

Some namespacing tips.

Also, a request: could you wait until I land #25592 (working on it) and then accept an Io parameter for doing the asynchronous work? I can help with this.

@jedisct1
Contributor Author

Sounds good! Let's wait for the Io interface first. Ditto for KT128/KT256.

* master:
  fix typo in std.debug.ElfFile.loadSeparateDebugFile
  Revert "ci: stop building FreeBSD module tests on x86_64-linux"
  Io: fix some horrible data races and UAFs caused by `Condition` misuse
Member

@andrewrk andrewrk left a comment

Only had time for a short review but I found one issue:

Member

@andrewrk andrewrk left a comment

Looking good - this is the only requested change

@jedisct1 jedisct1 merged commit d5585bc into ziglang:master Nov 1, 2025
9 checks passed
jedisct1 added a commit to jedisct1/zig that referenced this pull request Nov 1, 2025
* master:
  Implement threaded BLAKE3 (ziglang#25587)
  std: Skip element comparisons if `mem.order` args point to same memory
  std.Target: bump vulkan max version to 1.4.331
  std.Target: bump opencl/nvcl max version to 3.0.19
  std.Target: bump cuda max version to 13.0.2
  std.Target: bump amdhsa max version to 7.1.0
  std.Target: bump wasi max version to 0.3.0
  std.Target: bump dragonfly max version to 6.4.2
  std.Target: bump linux max version to 6.17
  std.Target: bump fuchsia max version to 28.0.0
  std.Target: bump contiki max version to 5.1.0
  test: remove some unsupported x86_64 darwin targets from llvm_targets
  std.os.windows: eliminate forwarder function in kernel32 (ziglang#25766)
@jedisct1 jedisct1 deleted the parallelblake3 branch November 1, 2025 06:55