
lto = "fat" causes doctest to generate invalid code for Apple M1 (and potentially x86) #116941

Closed
Related: zama-ai/tfhe-rs#721
Opened by @IceTDrinker

Description

We have the following code in our https://github.com/zama-ai/tfhe-rs project, at commit f1c21888a762ddf9de017ae52dc120c141ec9c02, in tfhe/docs/how_to/compress.md (line 44 and beyond):

use tfhe::prelude::*;
use tfhe::{
    generate_keys, set_server_key, ClientKey, CompressedServerKey, ConfigBuilder, FheUint8,
};

fn main() {
    let config = ConfigBuilder::all_disabled()
        .enable_default_integers()
        .build();

    // Generate a client key and the compressed form of the server key.
    let cks = ClientKey::generate(config);
    let compressed_sks = CompressedServerKey::new(&cks);

    println!(
        "compressed size  : {}",
        bincode::serialize(&compressed_sks).unwrap().len()
    );

    // Decompress the server key to compare serialized sizes.
    let sks = compressed_sks.decompress();

    println!(
        "decompressed size: {}",
        bincode::serialize(&sks).unwrap().len()
    );

    set_server_key(sks);

    // Sanity check: encrypt a value, add a clear value, decrypt, and compare.
    let clear_a = 12u8;
    let a = FheUint8::try_encrypt(clear_a, &cks).unwrap();

    let c = a + 234u8;
    let decrypted: u8 = c.decrypt(&cks);
    assert_eq!(decrypted, clear_a.wrapping_add(234));
}

I expected to see this happen: running the doctest with the following command should work (note that we modify the release profile to enable lto = "fat"):

RUSTFLAGS="-C target-cpu=native" cargo +nightly-2023-10-17 test --profile release --doc --features=aarch64-unix,boolean,shortint,integer,internal-keycache -p tfhe -- test_user_docs::how_to_compress
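
For reference, the profile modification boils down to the following in Cargo.toml (a minimal sketch; our actual release profile may set other options too):

[profile.release]
# Fat LTO across the whole dependency graph; this is the setting that
# triggers the doctest miscompilation, and "off" avoids it.
lto = "fat"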

Instead, this happened: the program crashes. Compiling the same code as a separate example with the same cargo configuration produces an executable that works, and turning LTO off also yields a doctest binary that works correctly, indicating LTO is at fault, or at least part of the problem, when combined with doctests.

This has been happening randomly for doctests across many Rust versions, but we could not identify the cause. It looks like enabling LTO creates a miscompile in which a value that is provably 0 (it is never modified by the code) trips an assertion as if it were != 0, crashing the program; sometimes different things error out, and it looks like the program is reading from the wrong location on the stack. The assertion in question is at https://github.com/zama-ai/tfhe-rs/blob/f1c21888a762ddf9de017ae52dc120c141ec9c02/tfhe/src/core_crypto/algorithms/ggsw_encryption.rs#L551
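
To make the failing invariant concrete, here is a hypothetical, heavily simplified model of the check (the names mirror the panic message in the backtrace below; the real type lives in tfhe-rs' core_crypto module and is more involved):

// Hypothetical model, not the tfhe-rs implementation.
#[derive(Clone, Copy)]
struct CiphertextModulus(u128); // 0 encodes the native (power-of-two) modulus

impl CiphertextModulus {
    fn is_compatible_with_native_modulus(self) -> bool {
        // 0 means "native modulus", which is trivially compatible; otherwise
        // require a power of two, which divides the native power-of-two modulus.
        self.0 == 0 || self.0.is_power_of_two()
    }
}

fn main() {
    // The stored value is provably 0 (nothing ever modifies it), so this
    // assertion should always hold; under fat LTO the equivalent assertion
    // in the doctest binary fires anyway.
    let modulus = CiphertextModulus(0);
    assert!(modulus.is_compatible_with_native_modulus());
}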

Unfortunately we are not able to minimize this issue at the moment, as it does not happen reliably across doctests.

Meta

rustc --version --verbose:

rustc 1.75.0-nightly (49691b1f7 2023-10-16)
binary: rustc
commit-hash: 49691b1f70d71dd7b8349c332b7f277ee527bf08
commit-date: 2023-10-16
host: aarch64-apple-darwin
release: 1.75.0-nightly
LLVM version: 17.0.2

Unfortunately, on nightly (which we use to recover the doctest binaries via RUSTDOCFLAGS="-Z unstable-options --persist-doctests doctestbins") the crash only shows up for the parallel version of an encryption algorithm used with rayon. On current stable we can also get the crash with a serial algorithm, but there we do not seem to be able to recover the doctest binary.
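
Concretely, recovering the doctest binaries amounts to rerunning the command above with the extra RUSTDOCFLAGS (a sketch; the doctestbins output directory name is ours):

RUSTDOCFLAGS="-Z unstable-options --persist-doctests doctestbins" \
RUSTFLAGS="-C target-cpu=native" \
cargo +nightly-2023-10-17 test --profile release --doc \
    --features=aarch64-unix,boolean,shortint,integer,internal-keycache \
    -p tfhe -- test_user_docs::how_to_compress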

doctest_miscompile.zip
The archive contains the objdump --disassemble output for the code compiled as an example (running fine) and for the code compiled as a doctest exhibiting the miscompilation. If needed I can provide the binaries, but I would understand if nobody wants to run a binary coming from a bug report.
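
The disassemblies in the archive were produced along these lines (a sketch; the binary paths are placeholders):

objdump --disassemble path/to/example_binary > example.asm
objdump --disassemble path/to/doctest_binary > doctest.asm
diff example.asm doctest.asm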

objdump --version
Apple LLVM version 14.0.3 (clang-1403.0.22.14.1)
  Optimized build.
  Default target: arm64-apple-darwin22.5.0
  Host CPU: apple-m1

  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_32 - AArch64 (little endian ILP32)
    aarch64_be - AArch64 (big endian)
    arm        - ARM
    arm64      - ARM64 (little endian)
    arm64_32   - ARM64 (little endian ILP32)
    armeb      - ARM (big endian)
    thumb      - Thumb
    thumbeb    - Thumb (big endian)
    x86        - 32-bit X86: Pentium-Pro and above
    x86-64     - 64-bit X86: EM64T and AMD64

Here is a snippet of a backtrace with two threads erroring on two different issues, while the same code compiled as an example runs without problems.

Backtrace

thread '<unnamed>' panicked at tfhe/src/core_crypto/algorithms/ggsw_encryption.rs:551:5:
assertion failed: ciphertext_modulus.is_compatible_with_native_modulus()
stack backtrace:
   0:        0x102712f6c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h06ea57ce7b13512d
   1:        0x10268b4f8 - core::fmt::write::h4d15d254ca20c331
   2:        0x1026c6a68 - std::io::Write::write_fmt::hfdc8b2852a9a03fa
   3:        0x102715ea0 - std::sys_common::backtrace::print::h139bbaa51f48014c
   4:        0x102715a08 - std::panicking::default_hook::{{closure}}::hbbb7d85a61092397
   5:        0x1027157cc - std::panicking::default_hook::hb0db088803baef11
   6:        0x102717234 - std::panicking::rust_panic_with_hook::h78dc274574606137
   7:        0x102716da8 - std::panicking::begin_panic_handler::{{closure}}::h2905be29dbe9281c
   8:        0x102716c88 - std::sys_common::backtrace::__rust_end_short_backtrace::h2a15f4fd2d64df91
   9:        0x102716c7c - _rust_begin_unwind
  10:        0x1027fe624 - core::panicking::panic_fmt::hd8e61ff6f38230f9
  11:        0x1027fe7b0 - core::panicking::panic::h4a945e52b5fb1050
  12:        0x1027990bc - tfhe::core_crypto::algorithms::glwe_encryption::encrypt_seeded_glwe_ciphertext_assign_with_existing_generator::hb32b93df2aa13c6e
  13:        0x1027d8d44 - <rayon::iter::for_each::ForEachConsumer<F> as rayon::iter::plumbing::Folder<T>>::consume_iter::h6b9d6bce496a26b2
  14:        0x10277099c - rayon::iter::plumbing::Producer::fold_with::h3252c105ae5580f0
  15:        0x10278c92c - rayon::iter::plumbing::bridge_producer_consumer::helper::h516df06807eeed76
  16:        0x10271ff70 - rayon_core::join::join_context::{{closure}}::h7ecf44f403b2e94c
  17:        0x102729d00 - rayon_core::registry::in_worker::hb2d005d9f62ec9b8
  18:        0x10278c918 - rayon::iter::plumbing::bridge_producer_consumer::helper::h516df06807eeed76
  19:        0x102792d0c - <<rayon::iter::map::Map<I,F> as rayon::iter::IndexedParallelIterator>::with_producer::Callback<CB,F> as rayon::iter::plumbing::ProducerCallback<T>>::callback::h282ea6fb42ca6c2b
  20:        0x10276aaa0 - <<rayon::iter::zip::Zip<A,B> as rayon::iter::IndexedParallelIterator>::with_producer::CallbackB<CB,A> as rayon::iter::plumbing::ProducerCallback<ITEM>>::callback::h6c6ab19b4791d17e
  21:        0x1027dcc88 - <<rayon::iter::enumerate::Enumerate<I> as rayon::iter::IndexedParallelIterator>::with_producer::Callback<CB> as rayon::iter::plumbing::ProducerCallback<I>>::callback::h62504345ff3d393a
  22:        0x10278f38c - rayon::iter::plumbing::bridge::h142cac5b932df279
  23:        0x1027de84c - rayon::iter::plumbing::Producer::fold_with::hda6c429fb67861a6
  24:        0x10278b204 - rayon::iter::plumbing::bridge_producer_consumer::helper::ha97da0be53d3520b
  25:        0x1027930fc - <<rayon::iter::map::Map<I,F> as rayon::iter::IndexedParallelIterator>::with_producer::Callback<CB,F> as rayon::iter::plumbing::ProducerCallback<T>>::callback::h5caece096ea77aa2
  26:        0x102768cdc - <<rayon::iter::zip::Zip<A,B> as rayon::iter::IndexedParallelIterator>::with_producer::CallbackA<CB,B> as rayon::iter::plumbing::ProducerCallback<ITEM>>::callback::h9c59859a5ada9da8
  27:        0x102790548 - rayon::iter::plumbing::bridge::h691ef483cd06a966
  28:        0x1027d896c - tfhe::core_crypto::algorithms::ggsw_encryption::par_encrypt_constant_seeded_ggsw_ciphertext_with_existing_generator::h1092854bcdddc1c5
  29:        0x1027d8540 - <rayon::iter::for_each::ForEachConsumer<F> as rayon::iter::plumbing::Folder<T>>::consume_iter::h58460779da245a1d
  30:        0x102771604 - rayon::iter::plumbing::Producer::fold_with::h5c2dab692eefc651
  31:        0x10278a424 - rayon::iter::plumbing::bridge_producer_consumer::helper::hd7e30ce6b8c8fdf8
  32:        0x102759bec - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::he14a52c10f982320
  33:        0x1027ff980 - rayon_core::registry::WorkerThread::wait_until_cold::hadf889fe03869109
  34:        0x10271ec34 - rayon_core::join::join_context::{{closure}}::h6ff07f0ad22d988f
  35:        0x1027292dc - rayon_core::registry::in_worker::h72ac659d0872c7bc
  36:        0x10278a410 - rayon::iter::plumbing::bridge_producer_consumer::helper::hd7e30ce6b8c8fdf8
  37:        0x102759bec - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::he14a52c10f982320
  38:        0x1027ff980 - rayon_core::registry::WorkerThread::wait_until_cold::hadf889fe03869109
  39:        0x10280004c - rayon_core::join::join_recover_from_panic::hac430d1fb14e684b
  40:        0x10271eb10 - rayon_core::join::join_context::{{closure}}::h6ff07f0ad22d988f
  41:        0x1027292dc - rayon_core::registry::in_worker::h72ac659d0872c7bc
  42:        0x10278a410 - rayon::iter::plumbing::bridge_producer_consumer::helper::hd7e30ce6b8c8fdf8
  43:        0x10271eac8 - rayon_core::join::join_context::{{closure}}::h6ff07f0ad22d988f
  44:        0x1027292dc - rayon_core::registry::in_worker::h72ac659d0872c7bc
  45:        0x10278a410 - rayon::iter::plumbing::bridge_producer_consumer::helper::hd7e30ce6b8c8fdf8
  46:        0x1027306d4 - rayon_core::join::join_context::{{closure}}::h6ff07f0ad22d988f
  47:        0x102750400 - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::h5752c5eaefb098bd
  48:        0x1027ff980 - rayon_core::registry::WorkerThread::wait_until_cold::hadf889fe03869109
  49:        0x1026a9300 - rayon_core::registry::ThreadBuilder::run::h03f0186f2f91b865
  50:        0x1026b1ee4 - std::sys_common::backtrace::__rust_begin_short_backtrace::hf857650a9dcd5e44
  51:        0x1026ac8c8 - core::ops::function::FnOnce::call_once{{vtable.shim}}::heab0ff5ef27f89d0
  52:        0x1027183c4 - std::sys::unix::thread::Thread::new::thread_start::h2ab8753089ede7d0
  53:        0x19832bfa8 - __pthread_joiner_wake
thread '<unnamed>' panicked at /rustc/49691b1f70d71dd7b8349c332b7f277ee527bf08/library/core/src/num/mod.rs:1166:5:
attempt to calculate the remainder with a divisor of zero
stack backtrace:
   0:        0x102712f6c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h06ea57ce7b13512d
   1:        0x10268b4f8 - core::fmt::write::h4d15d254ca20c331
   2:        0x1026c6a68 - std::io::Write::write_fmt::hfdc8b2852a9a03fa
   3:        0x102715ea0 - std::sys_common::backtrace::print::h139bbaa51f48014c
   4:        0x102715a08 - std::panicking::default_hook::{{closure}}::hbbb7d85a61092397
   5:        0x1027157cc - std::panicking::default_hook::hb0db088803baef11
   6:        0x102717234 - std::panicking::rust_panic_with_hook::h78dc274574606137
   7:        0x102716da8 - std::panicking::begin_panic_handler::{{closure}}::h2905be29dbe9281c
   8:        0x102716c88 - std::sys_common::backtrace::__rust_end_short_backtrace::h2a15f4fd2d64df91
   9:        0x102716c7c - _rust_begin_unwind
  10:        0x1027fe624 - core::panicking::panic_fmt::hd8e61ff6f38230f9
  11:        0x1027fe7b0 - core::panicking::panic::h4a945e52b5fb1050
  12:        0x1027990bc - tfhe::core_crypto::algorithms::glwe_encryption::encrypt_seeded_glwe_ciphertext_assign_with_existing_generator::hb32b93df2aa13c6e
  13:        0x1027d8d44 - <rayon::iter::for_each::ForEachConsumer<F> as rayon::iter::plumbing::Folder<T>>::consume_iter::h6b9d6bce496a26b2
  14:        0x10277099c - rayon::iter::plumbing::Producer::fold_with::h3252c105ae5580f0
  15:        0x10278c92c - rayon::iter::plumbing::bridge_producer_consumer::helper::h516df06807eeed76
  16:        0x102756c50 - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::hb4b2cce923b187bc
  17:        0x1027ff980 - rayon_core::registry::WorkerThread::wait_until_cold::hadf889fe03869109
  18:        0x10280004c - rayon_core::join::join_recover_from_panic::hac430d1fb14e684b
  19:        0x10271eb10 - rayon_core::join::join_context::{{closure}}::h6ff07f0ad22d988f
  20:        0x1027292dc - rayon_core::registry::in_worker::h72ac659d0872c7bc
  21:        0x10278a410 - rayon::iter::plumbing::bridge_producer_consumer::helper::hd7e30ce6b8c8fdf8
  22:        0x102759bec - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::he14a52c10f982320
  23:        0x1027ff980 - rayon_core::registry::WorkerThread::wait_until_cold::hadf889fe03869109
  24:        0x10280004c - rayon_core::join::join_recover_from_panic::hac430d1fb14e684b
  25:        0x10271eb10 - rayon_core::join::join_context::{{closure}}::h6ff07f0ad22d988f
  26:        0x1027292dc - rayon_core::registry::in_worker::h72ac659d0872c7bc
  27:        0x10278a410 - rayon::iter::plumbing::bridge_producer_consumer::helper::hd7e30ce6b8c8fdf8
  28:        0x102759bec - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::he14a52c10f982320
  29:        0x1027ff980 - rayon_core::registry::WorkerThread::wait_until_cold::hadf889fe03869109
  30:        0x1026a9300 - rayon_core::registry::ThreadBuilder::run::h03f0186f2f91b865
  31:        0x1026b1ee4 - std::sys_common::backtrace::__rust_begin_short_backtrace::hf857650a9dcd5e44
  32:        0x1026ac8c8 - core::ops::function::FnOnce::call_once{{vtable.shim}}::heab0ff5ef27f89d0
  33:        0x1027183c4 - std::sys::unix::thread::Thread::new::thread_start::h2ab8753089ede7d0
  34:        0x19832bfa8 - __pthread_joiner_wake

We have also seen some flaky doctests on x86_64 but could not narrow down the issue; we have turned LTO off for all of our doctests for now and will monitor how things evolve. The reason we suspect an issue on x86_64 as well is that the M1 builds have been running with LTO off for months and have never exhibited the flakiness we saw on x86_64. That said, the compiled code there is significantly different (intrinsics usage being one factor), so we cannot yet be sure a similar issue is happening on x86_64.

Cheers

Labels

A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
C-bug (Category: This is a bug.)
E-needs-mcve (Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example.)
I-unsound (Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness)
O-AArch64 (Armv8-A or later processors in AArch64 mode)
P-high (High priority)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
