Fix Keccak dispatch race and cross-backend split#2426
Open
dstebila wants to merge 40 commits into
Open
Conversation
Copied from slh dsa commit succesfully
Info: Updating KAT for HQC-128
Error: Cannot find location clean in upstream list
[{'name': 'ref', 'version': '2025-08-22', 'folder_name': '.', 'compile_opts': '-DHQC_ARCH_REF=1', 'signature_keypair': 'crypto_kem_keypair', 'signature_enc': 'crypto_kem_enc', 'signature_dec': 'crypto_kem_dec', 'sources': 'src/ref/parsing.c src/ref/gf.c src/ref/vector.c src/ref/gf2x.c src/ref/reed_solomon.c src/ref/reed_muller.c src/ref/hqc.c src/ref/vector.h src/ref/hqc-1/reed_solomon.h src/ref/hqc-1/parameters.h src/ref/hqc.h src/ref/parsing.h src/ref/data_structures.h src/ref/gf.h src/ref/gf2x.h'}]
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Running "tests/test_kem HQC-128" results in segmentation fault.
Build command:
cmake -GNinja \
-DOQS_MINIMAL_BUILD="KEM_hqc_128;KEM_hqc_192;KEM_hqc_256" \
..
ninja
Removed <immintrin.h> from ref implementations; replaced fips202.c with
OQS_SHA3. liboqs can build, but no test cases, no NIST KAT, no test
vectors
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
tests/test_kem "HQC-128" still fails with segmentation fault: Process 3191 stopped * thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0xc8) frame #0: 0x0000000100035088 test_kem`keccak_inc_squeeze(h="", outlen=16, s=0x0000000000000000, r=136) at xkcp_sha3.c:189:18 186 **************************************************/ 187 static void keccak_inc_squeeze(uint8_t *h, size_t outlen, 188 uint64_t *s, uint32_t r) { -> 189 while (outlen > s[25]) { 190 (*Keccak_ExtractBytes_ptr)(s, h, (unsigned int)(r - s[25]), (unsigned int)s[25]); 191 (*Keccak_Permute_ptr)(s); 192 h += s[25]; Target 0: (test_kem) stopped. * thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0xc8) * frame #0: 0x0000000100035088 test_kem`keccak_inc_squeeze(h="", outlen=16, s=0x0000000000000000, r=136) at xkcp_sha3.c:189:18 frame #1: 0x0000000100034d3c test_kem`SHA3_shake256_inc_squeeze(output="", outlen=16, state=0x0000000100048298) at xkcp_sha3.c:382:2 frame #2: 0x00000001000358dc test_kem`OQS_SHA3_shake256_inc_squeeze(output="", outlen=16, state=0x0000000100048298) at sha3.c:148:2 frame #3: 0x0000000100007924 test_kem`prng_get_bytes(output="", outlen=16) at symmetric.c:45:5 frame #4: 0x0000000100005a54 test_kem`crypto_kem_enc(c_kem="", K="", ek_kem="'x\U00000015\x88\xf6\U0000000f\xf0\xce\U000000101\a፵e\xfe\x92~\r\xb7sSZ\xc4\U0000000e\xd2\xda\xd9%պ\x8e\xee-\xf9\xe07<@v\n\xe6w\x8d?\xe6B\xb7\xb9X\x8b\xae\x86nr@\xcdr!\xd3\U00000002'+[\U00000011\U0000001f\xb0\x8e\xcc\xd0Zl\xd3\xd8-\U00000001Im\xfft\x9b\x9d]\xa2\xd2Nu0\x83\xce}D$\\\xe8o,\x99\x99D\x8c\xb4\U00000016\xb5\xb5\xbd\xfaĸkA Ao\x94」Yb\a\xcfa\U0000001dQ:\U00000018\xe5\x8cI\xbd/\xd9U\xcb\xe9\U00000019\xb9JD\xfc\x86^\xf6W\U00000016\xa9xz3G\x95\x97Hs5\xc2\U00000001I\xa3\x9c\xd7\xf2\U0000000f\xf8\U00000005\xeb\x92{\xfc\x9fa\U00000018O\x96\xf8\xa7tH[") at kem.c:123:5 frame #5: 0x000000010000306c test_kem`OQS_KEM_hqc_128_encaps(ciphertext="", shared_secret="", public_key="'x\U00000015\x88\xf6\U0000000f\xf0\xce\U000000101\a፵e\xfe\x92~\r\xb7sSZ\xc4\U0000000e\xd2\xda\xd9%պ\x8e\xee-\xf9\xe07<@v\n\xe6w\x8d?\xe6B\xb7\xb9X\x8b\xae\x86nr@\xcdr!\xd3\U00000002'+[\U00000011\U0000001f\xb0\x8e\xcc\xd0Zl\xd3\xd8-\U00000001Im\xfft\x9b\x9d]\xa2\xd2Nu0\x83\xce}D$\\\xe8o,\x99\x99D\x8c\xb4\U00000016\xb5\xb5\xbd\xfaĸkA Ao\x94」Yb\a\xcfa\U0000001dQ:\U00000018\xe5\x8cI\xbd/\xd9U\xcb\xe9\U00000019\xb9JD\xfc\x86^\xf6W\U00000016\xa9xz3G\x95\x97Hs5\xc2\U00000001I\xa3\x9c\xd7\xf2\U0000000f\xf8\U00000005\xeb\x92{\xfc\x9fa\U00000018O\x96\xf8\xa7tH[") at kem_hqc_128.c:61:22 frame #6: 0x00000001000374dc test_kem`OQS_KEM_encaps(kem=0x0000000100aed6e0, ciphertext="", shared_secret="", public_key="'x\U00000015\x88\xf6\U0000000f\xf0\xce\U000000101\a፵e\xfe\x92~\r\xb7sSZ\xc4\U0000000e\xd2\xda\xd9%պ\x8e\xee-\xf9\xe07<@v\n\xe6w\x8d?\xe6B\xb7\xb9X\x8b\xae\x86nr@\xcdr!\xd3\U00000002'+[\U00000011\U0000001f\xb0\x8e\xcc\xd0Zl\xd3\xd8-\U00000001Im\xfft\x9b\x9d]\xa2\xd2Nu0\x83\xce}D$\\\xe8o,\x99\x99D\x8c\xb4\U00000016\xb5\xb5\xbd\xfaĸkA Ao\x94」Yb\a\xcfa\U0000001dQ:\U00000018\xe5\x8cI\xbd/\xd9U\xcb\xe9\U00000019\xb9JD\xfc\x86^\xf6W\U00000016\xa9xz3G\x95\x97Hs5\xc2\U00000001I\xa3\x9c\xd7\xf2\U0000000f\xf8\U00000005\xeb\x92{\xfc\x9fa\U00000018O\x96\xf8\xa7tH[") at kem.c:653:10 frame #7: 0x00000001000014fc test_kem`kem_test_correctness(method_name="HQC-128", derand=false) at test_kem.c:253:8 frame #8: 0x0000000100000e94 test_kem`test_wrapper(arg=0x000000016fdfe3b8) at test_kem.c:378:11 frame #9: 0x000000018d86bc58 libsystem_pthread.dylib`_pthread_start + 136 Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
======================================= short test summary info ======================================== FAILED tests/test_binary.py::test_namespace - AssertionError: assert 129 == 0 FAILED tests/test_code_conventions.py::test_style - AssertionError: Got unexpected return code 255 FAILED tests/test_code_conventions.py::test_spdx - assert False FAILED tests/test_kat.py::test_kem[HQC-128] - AssertionError: assert 'xxx' == '1706aa30e811...1fa4b654b2b8e' FAILED tests/test_kat.py::test_kem[HQC-192] - AssertionError: assert 'xxx' == '5e64abaac8f9...e8e610db31188' FAILED tests/test_kat.py::test_kem[HQC-256] - AssertionError: assert 'xxx' == '6d0c22e3c0f0...a8ff5ce8df327' FAILED tests/test_kat_all.py::test_kem[HQC-128] - AssertionError: assert '2d566f0a2743...c6b425f86a116' == 'f604e7edaa4c...007ce066fcd75' FAILED tests/test_kat_all.py::test_kem[HQC-192] - AssertionError: assert '1b0137144f71...39ee740cbbc83' == '4063a6079ded...543c4acd5cb80' FAILED tests/test_kat_all.py::test_kem[HQC-256] - AssertionError: assert 'c13dcd7fa9ec...12232a6fde81e' == '7ee833200a2e...0128d28202ca3' ======================= 9 failed, 274 passed, 4431 skipped in 147.34s (0:02:27) ======================= Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Also: add newline at the end of kats.json Now KATs all pass for HQC-1/3/5. Need to work on namespacing next. Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
For the following symbols, either change to static or add prefix . > nm -g build/lib/liboqs.a Non-namespaced symbol: _code_decode Non-namespaced symbol: _code_encode Non-namespaced symbol: _fft Non-namespaced symbol: _fft_retrieve_error_poly Non-namespaced symbol: _gf_carryless_mul Non-namespaced symbol: _gf_generate Non-namespaced symbol: _gf_inverse Non-namespaced symbol: _gf_mul Non-namespaced symbol: _gf_square Non-namespaced symbol: _vect_mul Non-namespaced symbol: _hqc_pke_decrypt Non-namespaced symbol: _hqc_pke_encrypt Non-namespaced symbol: _hqc_pke_keygen Non-namespaced symbol: _PQCHQC_C_crypto_kem_dec Non-namespaced symbol: _PQCHQC_C_crypto_kem_enc Non-namespaced symbol: _PQCHQC_C_crypto_kem_keypair Non-namespaced symbol: _hqc_c_kem_from_string Non-namespaced symbol: _hqc_c_kem_to_string Non-namespaced symbol: _hqc_dk_pke_from_string Non-namespaced symbol: _hqc_ek_pke_from_string Non-namespaced symbol: _encode Non-namespaced symbol: _expand_and_sum Non-namespaced symbol: _find_peaks Non-namespaced symbol: _hadamard Non-namespaced symbol: _reed_muller_decode Non-namespaced symbol: _reed_muller_encode Non-namespaced symbol: _compute_generator_poly Non-namespaced symbol: _reed_solomon_decode Non-namespaced symbol: _reed_solomon_encode Non-namespaced symbol: _prng_get_bytes Non-namespaced symbol: _prng_init Non-namespaced symbol: _xof_get_bytes Non-namespaced symbol: _xof_init Non-namespaced symbol: _vect_add Non-namespaced symbol: _vect_compare Non-namespaced symbol: _vect_generate_random_support1 Non-namespaced symbol: _vect_generate_random_support2 Non-namespaced symbol: _vect_print Non-namespaced symbol: _vect_sample_fixed_weight1 Non-namespaced symbol: _vect_sample_fixed_weight2 Non-namespaced symbol: _vect_set_random Non-namespaced symbol: _vect_truncate Non-namespaced symbol: _vect_write_support_to_vector Non-namespaced symbol: _code_decode Non-namespaced symbol: _code_encode Non-namespaced symbol: _fft Non-namespaced symbol: _fft_retrieve_error_poly Non-namespaced symbol: _gf_carryless_mul Non-namespaced symbol: _gf_generate Non-namespaced symbol: _gf_inverse Non-namespaced symbol: _gf_mul Non-namespaced symbol: _gf_square Non-namespaced symbol: _vect_mul Non-namespaced symbol: _hqc_pke_decrypt Non-namespaced symbol: _hqc_pke_encrypt Non-namespaced symbol: _hqc_pke_keygen Non-namespaced symbol: _PQCHQC_C_crypto_kem_dec Non-namespaced symbol: _PQCHQC_C_crypto_kem_enc Non-namespaced symbol: _PQCHQC_C_crypto_kem_keypair Non-namespaced symbol: _hqc_c_kem_from_string Non-namespaced symbol: _hqc_c_kem_to_string Non-namespaced symbol: _hqc_dk_pke_from_string Non-namespaced symbol: _hqc_ek_pke_from_string Non-namespaced symbol: _encode Non-namespaced symbol: _expand_and_sum Non-namespaced symbol: _find_peaks Non-namespaced symbol: _hadamard Non-namespaced symbol: _reed_muller_decode Non-namespaced symbol: _reed_muller_encode Non-namespaced symbol: _compute_generator_poly Non-namespaced symbol: _reed_solomon_decode Non-namespaced symbol: _reed_solomon_encode Non-namespaced symbol: _prng_get_bytes Non-namespaced symbol: _prng_init Non-namespaced symbol: _xof_get_bytes Non-namespaced symbol: _xof_init Non-namespaced symbol: _vect_add Non-namespaced symbol: _vect_compare Non-namespaced symbol: _vect_generate_random_support1 Non-namespaced symbol: _vect_generate_random_support2 Non-namespaced symbol: _vect_print Non-namespaced symbol: _vect_sample_fixed_weight1 Non-namespaced symbol: _vect_sample_fixed_weight2 Non-namespaced symbol: _vect_set_random Non-namespaced symbol: _vect_truncate Non-namespaced symbol: _vect_write_support_to_vector Non-namespaced symbol: _code_decode Non-namespaced symbol: _code_encode Non-namespaced symbol: _fft Non-namespaced symbol: _fft_retrieve_error_poly Non-namespaced symbol: _gf_carryless_mul Non-namespaced symbol: _gf_generate Non-namespaced symbol: _gf_inverse Non-namespaced symbol: _gf_mul Non-namespaced symbol: _gf_square Non-namespaced symbol: _vect_mul Non-namespaced symbol: _hqc_pke_decrypt Non-namespaced symbol: _hqc_pke_encrypt Non-namespaced symbol: _hqc_pke_keygen Non-namespaced symbol: _PQCHQC_C_crypto_kem_dec Non-namespaced symbol: _PQCHQC_C_crypto_kem_enc Non-namespaced symbol: _PQCHQC_C_crypto_kem_keypair Non-namespaced symbol: _hqc_c_kem_from_string Non-namespaced symbol: _hqc_c_kem_to_string Non-namespaced symbol: _hqc_dk_pke_from_string Non-namespaced symbol: _hqc_ek_pke_from_string Non-namespaced symbol: _encode Non-namespaced symbol: _expand_and_sum Non-namespaced symbol: _find_peaks Non-namespaced symbol: _hadamard Non-namespaced symbol: _reed_muller_decode Non-namespaced symbol: _reed_muller_encode Non-namespaced symbol: _compute_generator_poly Non-namespaced symbol: _reed_solomon_decode Non-namespaced symbol: _reed_solomon_encode Non-namespaced symbol: _prng_get_bytes Non-namespaced symbol: _prng_init Non-namespaced symbol: _xof_get_bytes Non-namespaced symbol: _xof_init Non-namespaced symbol: _vect_add Non-namespaced symbol: _vect_compare Non-namespaced symbol: _vect_generate_random_support1 Non-namespaced symbol: _vect_generate_random_support2 Non-namespaced symbol: _vect_print Non-namespaced symbol: _vect_sample_fixed_weight1 Non-namespaced symbol: _vect_sample_fixed_weight2 Non-namespaced symbol: _vect_set_random Non-namespaced symbol: _vect_truncate Non-namespaced symbol: _vect_write_support_to_vector Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
However, incorrect namespacing may have caused some symbols from HQC-1 to pollute symbols that should have belonged to HQC-3 and HQC-5 Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
…ded tests] Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
…xtended tests] Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
On certain platforms, "unsigned long" is 32-bit wide, so (1UL << x) can overflow if x is greater than 31, such as with HQC-5. Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
WHY WHY WHY what is even the difference???? Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
For liboqs integration, hash_g implementation has been modified There are some unexpected dispatch errors under liboqs/src/common/sha3 that causes intermittent GitHub Action test failures. Due to the difficulty of debugging on GitHub action, I chose this shortcut of replacing incremental API with a single hash over a continuous strip of memory. There might be performance and stack memory usage penalty associated with this approach, but I deemed it acceptable. we need to eventually figure out what happened with Keccak dispatch and restore the usage of incremental API Signed-off-by: Ganyu (Bruce) Xu <g66xu@uwaterloo.ca>
This reverts commit 955e000. Signed-off-by: Douglas Stebila <dstebila@uwaterloo.ca>
Keccak_Dispatch() in xkcp_sha3.c overwrote the global sha3_default_callbacks struct field-by-field on first SHA3 call when AVX512VL was available. Two problems: 1. The struct copy is non-atomic and races against concurrent SHA3 callers on other threads, who can observe a partially-overwritten vtable. 2. The very call that triggered dispatch had already entered through the xkcp _inc_init. After the swap, subsequent _inc_absorb / _inc_finalize on that same context route through the newly-installed AVX512VL table, so one in-flight context is initialized by one backend and absorbed/ finalized by another. Move the top-level backend selection out of Keccak_Dispatch and into a one-time init in sha3.c / sha3x4.c that swaps the 'callbacks' POINTER (single aligned-pointer store) via pthread_once. Keccak_Dispatch still sets the xkcp-internal function pointers (Keccak_*_ptr) for its own use, but no longer touches the top-level table. Signed-off-by: Douglas Stebila <dstebila@uwaterloo.ca>
5e23d5d to
6c93505
Compare
sha3_avx512vl_callbacks and sha3_x4_avx512vl_callbacks are declared const; storing them in a non-const callbacks pointer required a const-stripping cast that fails under -Werror -Wcast-qual in the fuzzing build. Store the dispatch pointer as const internally; the public API signature is unchanged (non-const input is fine to assign into a const-pointer slot). Try to fix https://github.com/open-quantum-safe/liboqs/actions/runs/25812670070/job/75833444016?pr=2426 Signed-off-by: Douglas Stebila <dstebila@uwaterloo.ca>
Signed-off-by: Douglas Stebila <dstebila@uwaterloo.ca>
The partial-fill-and-permute path in the sha3_absorb macro consumed
\`capacity\` bytes via keccak_1600_partial_add (which advances arg2 and
clobbers %r12) but did not decrement the remaining message-length
tracker. The post-permute path at label 3 then treated the original
mlen as still pending, causing the bytes already consumed in the
partial fill to be 're-absorbed' (reading past the end of the input
buffer) into the post-permute state and leaving s[25] off by
\`capacity\`. Symptom: SHA3-{256,384,512} digests are wrong whenever
an incremental absorb call exactly fills, or crosses, the rate
boundary - reproduced via HQC-3 / HQC-5 KAT failures on AVX512
runners (HQC-1 is unaffected because its hash_g input is 65 bytes,
which never crosses the 72-byte SHA3-512 rate).
The matching shake_absorb macro in the same file already has the
correct pattern (subq %r12, arg3 before the partial_add call); this
just mirrors it in sha3_absorb.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Douglas Stebila <dstebila@uwaterloo.ca>
a6b1f68 to
f7742f4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Claude Code thinks that this might be the cause of the intermittent SHA3 failures. The suggestion and fix seem plausible to me but I don't have access to an AVX512 machine to test it out.
Claude's report: