Skip to content

BLAKE3 hash support "portable" #6393

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

divinity76
Copy link
Contributor

@divinity76 divinity76 commented Oct 30, 2020

this PR is an alternative to #6358
with some notable differences,

  • this PR is roughly 1700 SLOC, compared to BLAKE3 hash support #6358 at roughly 31,000 SLOC
  • this PR only bundles the "portable C" version of BLAKE3, it does not contain any of the SSE/AVX/NEON cpu-optimized non-portable implementations (hence the huge SLOC difference)

the idea here is that just having BLAKE3 supported in PHP, even if it's not the most performant implementation, is better than no support at all,

unsurprisingly, the portable version of BLAKE3 isn't nearly as fast as in #6358 , being beaten by MD4 (still beats MD5 though), but it still does a pretty good job in /ext/hash/bech.php, being roughly 1.9 times faster than sha3-224 on my laptop (compared to 4.2 times faster in #6358 ), and being the fastest secure hash on the list,

hans@xDevAd:~/projects/php-src/ext/hash$ ../../sapi/cli/php bench.php 2240
crc32b       0.001342
crc32c       0.001426
crc32        0.001652
adler32      0.012001
fnv132       0.021862
fnv164       0.022513
fnv1a64      0.022540
md4          0.023295
fnv1a32      0.025501
joaat        0.028628
blake3       0.029989
tiger192,3   0.030417
tiger128,3   0.031244
md5          0.031678
tiger160,3   0.031772
tiger128,4   0.038958
sha1         0.039075
tiger192,4   0.039367
tiger160,4   0.041979
sha3-224     0.058326
sha3-256     0.061110
ripemd128    0.066177
ripemd256    0.066378
sha512/224   0.067940
sha512       0.068035
haval224,3   0.069324
sha384       0.069505
haval256,3   0.069509
haval192,3   0.070219
haval160,3   0.070840
sha512/256   0.071555
haval128,3   0.072177
sha3-384     0.080816
ripemd160    0.089877
ripemd320    0.092976
haval128,4   0.097042
haval160,4   0.097483
haval192,4   0.097524
haval224,4   0.097618
haval256,4   0.099826
sha256       0.106817
sha224       0.107578
sha3-512     0.115257
haval160,5   0.122693
haval128,5   0.125198
haval192,5   0.125270
haval224,5   0.125790
haval256,5   0.127719
whirlpool    0.167175
gost-crypto  0.292890
gost         0.293071
snefru256    0.592855
snefru       0.599397
md2          2.154187

now my idea was that users who want the SSE/AVX/neon optimized hashes compiled in can compile php with ./configure --with-blake3-upstream-c-source-dir=DIR , where the SSE/AVX/etc optimized implementations can be found, .. but i have been unable to figure out how to tell /ext/hash/config.m4 "these paths are ABSOLUTE paths, not relative paths", so when i currently try

time /bin/sh -c './buildconf && ./configure --with-blake3-upstream-c-source-dir=/home/hans/projects/BLAKE3/c --disable-all --disable-cgi --enable-cli --enable-debug && make clean && make -j4'

it ends with:

/bin/bash /home/hans/projects/php-src/libtool --silent --preserve-dup-deps --mode=compile cc -Iext/hash/ -I/home/hans/projects/php-src/ext/hash/ -I/home/hans/projects/php-src/include -I/home/hans/projects/php-src/main -I/home/hans/projects/php-src -I/home/hans/projects/php-src/ext/date/lib -I/home/hans/projects/php-src/TSRM -I/home/hans/projects/php-src/Zend    -fno-common -Wformat-truncation -Wlogical-op -Wduplicated-cond -Wno-clobbered -Wall -Wextra -Wno-strict-aliasing -Wno-implicit-fallthrough -Wno-unused-parameter -Wno-sign-compare -g -fvisibility=hidden -O0 -DZEND_SIGNALS   -I/home/hans/projects/php-src/ext/hash/sha3/generic64lc -DKeccakP200_excluded -DKeccakP400_excluded -DKeccakP800_excluded -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1 -I/home/hans/projects/BLAKE3/c -c /home/hans/projects/php-src/ext/hash/hash_blake3.c -o ext/hash/hash_blake3.lo 
make: *** No rule to make target '/home/hans/projects/php-src/ext/hash//home/hans/projects/BLAKE3/c/blake3.c', needed by 'ext/hash//home/hans/projects/BLAKE3/c/blake3.lo'.  Stop.

if anyone has any idea how to make ./configure --with-blake3-upstream-c-source-dir=DIR actually work, please let me know

this time i'm only importing the portable version..
this will result in decreased performance, obiously..

if i'm importing the assembly optimized versions, it will be some ~30,000 lines ( see php#6358 )
this is a significantly smaller alternative to the PR at php#6358 ,
now blake3 "portable c" version is bundled (specifically version 0.3.7 plus a few patches that will be part of the 0.3.8 release..), and ./configure supports a new optional --with-blake3-upstream-c-source-dir=DIR argument for specifying the location of the upstream BLAKE3 C implementation, if invoked, the SSE2/SSE4.1/AVX2/AVX512 optimized versions of BLAKE3 will be compiled in when applicable (this has not been added to MSVC, i don't know how to do it on MSVC, and i don't have a MSVC system to figure it out out on, if someone think getting those optimizations available on MSVC is important, take it up with the windows php mailing list.. just getting the portable version to compile on MSVC was good enough for me.)

also userland scripts can detect at runtime if the portable version or the upstream version, of BLAKE was compiled by consulting phpinfo(), it will either say "blake3 implementation: portable 0.3.7" or "blake3 implementation: upstream X.X.X"
@divinity76 divinity76 marked this pull request as ready for review October 31, 2020 01:10
@testAccountDeltas
Copy link

Hello All. BIG Fix

/* 15. Free Willy (here be crashes) */

zend_init_interned_strings_ht(&CG(interned_strings), 0);

To
zend_init_interned_strings_ht(&CG(interned_strings), 1);

There is no more crash.

@divinity76
Copy link
Contributor Author

@divinity76 divinity76 mentioned this pull request Oct 31, 2020
@testAccountDeltas
Copy link

@testAccountDeltas you're making no sense..
https://getyarn.io/yarn-clip/dca4e3fd-12d5-4dd3-9998-f99682bd471f

I announced how to get rid of crashes, with the impossibility of freeing the pointer for a hash table. Which led to
:777c8786 ntdll.RtlFreeHeap + 0x46
:777c8786 ntdll.RtlFreeHeap + 0x46
:77663c9b ucrtbase._free_base + 0x1b
:77663c68 ucrtbase.free + 0x18
:66fa2e52 ; php8ts.dll
:75ff51ec

@divinity76
Copy link
Contributor Author

@testAccountDeltas

I announced how to get rid of crashes, with the impossibility of freeing the pointer for a hash table. Which led to

do you mean that on windows, it crash when trying to free the hash ram?
the memory itself is supposed to be allocated on line 377 here

ops->hash_init(context);

and it's supposed to be freed on line 397 here

efree(context);

if that is crashing or leaking memory, it probably means i got this wrong:

// help: is V correct?
#define PHP_BLAKE3_SPEC "b8b8qb64bbbbb1760"

in ext/hash/php_hash_blake3.h at
https://github.com/php/php-src/pull/6393/files#diff-e70eb31559ac252e015177c3b5b53f974c434718c3cd368278592095aa89d11fR11-R12

hmm, maybe b8b8qb64bbbbb1760 is wrong, but then i'm not sure what it's supposed to be, anyone?

@testAccountDeltas
Copy link

testAccountDeltas commented Nov 1, 2020

@testAccountDeltas

I announced how to get rid of crashes, with the impossibility of freeing the pointer for a hash table. Which led to

do you mean that on windows, it crash when trying to free the hash ram?

Crash happens even under Linux if you run the debugger. And you are trying to execute a large script that operates on strings.

Crash occurs for the CG hash table (interned_strings)

Sorry, but I am not good at reporting errors.

@Gauss23
Copy link

Gauss23 commented Nov 27, 2020

I like the idea of having Blake3 hashing implemented. Maybe think about file hashing, too. md5_file() and sha1_file() are really outdated. Better than the need to call b3sum on the command line:
https://github.com/BLAKE3-team/BLAKE3

@k0001
Copy link

k0001 commented Nov 29, 2020

What would be necessary to have support for the BLAKE3 keyed hash function (blake3_hasher_init_keyed) and BLAKE3 key derivation function (blake3_hasher_init_derive_key_raw) exported as well?

For context: The BLAKE3 keyed hash function is generally used as a replacement for HMAC or similar, and the BLAKE3 key derivation function is generally used as a replacement for HKDF or similar.

@divinity76
Copy link
Contributor Author

divinity76 commented Nov 29, 2020

@k0001 for blake3_hasher_init_keyed i'm guessing it would require a new HASH_BLAKE3_MAC flag for hash_init() (or perhaps even just a generic HASH_MAC flag? which throws an exception if one tries to use it with any algorithm not sporting a native keyed mode?) but..

this patch already supports HMAC with blake3 like

hash_init ( "blake3", HASH_HMAC, $key = "whatever" )

and sure HMAC is slower than just using blake3's native keyed mode, but.. i imagine that using HMAC with blake3 is close enough for most projects, i won't bother trying to get native blake3 keyed mode supported

as for blake3_hasher_init_derive_key_raw ... i don't know. maybe a brand new hash_derive_key() function? i would try asking @sgolemon

@k0001
Copy link

k0001 commented Nov 29, 2020

@divinity76 thank you for your quick reply.

Notice that while HMAC and HKDF are similar in purpose to BLAKE3's functions, these are not compatible algorithms. Their outputs are different. If a particular algorithm or third party project expects the BLAKE3 keyed hash of some input, or maybe it expects us to perform a BLAKE3 key derivation, then that's what one needs to do. Having HMAC or HKDF support just won't be enough. This is unrelated to performance matters, it's a matter of compatibility.

For some context: I'm unfamiliar with PHP, but at the moment I'm looking at implement some code that would need to perform these BLAKE3 functions. After searching online for BLAKE3 support in PHP, I came across this issue. HMAC and HKDF are not enough for my needs, since the third party project I'm trying to integrate doesn't use those algorithms.

Regarding hash_init dealing with that HASH_HMAC flag: I'm unfamiliar with this is usually done in PHP, but I suppose there could be a HASH_BLAKE3_KEYED as well. The key is always 32 bytes long.

Regarding a potential hash_derive_key(): I don't know if trying to generalize this makes a lot of sense. For example, whereas HKDF takes an arbitrary length salt, an arbitrary length derivation context, and arbitrary length input key material, the BLAKE3 key derivation function only takes the arbitrary length context and the arbitrary length input key material. That is, no salt.

Notice that whether BLAKE3 hashes in normal mode, keyed mode or key derivation mode, are all BLAKE3 initialization parameters. So, presumably, all of this could be handled by hash_init. Perhaps there could be yet another flag, say HASH_BLAKE3_DERIVE_KEY, that takes the context as input, similarly to how the HASH_BLAKE3_KEYED flag suggested above takes the key. The update and finalize functions will continue to work just fine, they don't need to be modified.

I'm happy to implement this myself.

@ramsey ramsey added the Feature label Jun 9, 2021
@Girgias
Copy link
Member

Girgias commented Jan 25, 2023

Closing as stale

@Girgias Girgias closed this Jan 25, 2023
@php php deleted a comment from smokefire69 Jan 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants