Miscompilation of lbzip2 after loop-vectorize pass for avx512 #87189

Closed
@AngryLoki

Hi, it was originally reported in https://bugs.gentoo.org/910438 that lbzip2 compiled by clang for AVX-512 platforms produces corrupted bz2 files.

I investigated, with the following results:

  1. After adding -fsanitize=undefined,address the issue disappears completely (i.e. the produced files are correct and the sanitizers report nothing).
  2. The minimal flags to reproduce with clang are -mavx512f -fvectorize -O1.
  3. With OptBisect I see that the pass that breaks the code is LoopVectorizePass. If I compile all C files to LLVM bc with -O0 and apply opt -passes=loop-vectorize -mcpu=znver4 to encode.c, the resulting bz2 files are corrupted:
     clang -c -emit-llvm -mavx512f -O1 -DHAVE_CONFIG_H -Isrc -Ilib src/encode.c -o src/encode.o
     opt -passes=loop-vectorize -mcpu=znver4 src/encode.o -o src/encode.opt.o
  4. If I set __attribute__ ((optnone)) on all functions in encode.c, the generated code behaves correctly. To reproduce the issue, it is enough to apply optimizations only to assign_codes and generate_prefix_code.
  5. No issues with clang/avx256 or gcc/avx512.
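
For anyone repeating the OptBisect step, the search for the first bad pass can be automated with a plain binary search over the value passed to `-mllvm -opt-bisect-limit=N`. The sketch below is not what I ran verbatim. `find_first_bad` and `is_bad` are hypothetical names, and `is_bad` is a predicate you have to supply yourself: it should rebuild with the given limit, run the compress/verify test, and return 0 when the output is corrupted.

```shell
#!/bin/sh
# Binary search for the smallest opt-bisect limit at which the build goes bad.
#
# Assumption: is_bad is a user-supplied shell function (hypothetical name).
#   is_bad N  -> rebuilds with `-mllvm -opt-bisect-limit=N`, runs the test,
#                returns 0 if the result is corrupted, non-zero if it is fine.
#
# Usage: find_first_bad GOOD_LIMIT BAD_LIMIT
#   GOOD_LIMIT: a limit known to give correct output (e.g. 0)
#   BAD_LIMIT:  a limit known to give corrupted output
find_first_bad() {
    lo=$1   # invariant: build with limit $lo is good
    hi=$2   # invariant: build with limit $hi is bad
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if is_bad "$mid"; then
            hi=$mid
        else
            lo=$mid
        fi
    done
    # $hi is now the first limit that produces a bad build.
    echo "$hi"
}
```

The number printed is the index of the first pass execution that breaks the result; clang prints the matching `BISECT: running pass (N) ...` line when built with that limit, which names the pass and the function it ran on.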

I attach a partially compiled lbzip2.bc and a test file, compressme, here: lbzip2.zip. After loop-vectorize, the resulting lbzip2 produces corrupted output for almost any input file:

# with llvm release 17 or 18
opt -passes=loop-vectorize -mcpu=znver4 lbzip2.bc -o lbzip2.opt.bc  # <--
clang -O0 lbzip2.opt.bc -o lbzip2
./lbzip2 -z -k compressme
bzip2 -d -t compressme.bz2

> bzip2: ../environment.bz2: data integrity (CRC) error in data
> You can use the `bzip2recover' program to attempt to recover
> data from undamaged sections of corrupted files.
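
To check the "almost any file" claim on more inputs, a round-trip comparison is a little stricter than `bzip2 -d -t`, since it compares the decompressed bytes with the original. This is only a sketch: `check_roundtrip` is a hypothetical helper, and it assumes both tools can work as stdin/stdout filters (for example `./lbzip2 -z -c` and `bzip2 -d -c`), which is a different invocation from the file-based one above.

```shell
#!/bin/sh
# check_roundtrip COMPRESS_CMD DECOMPRESS_CMD FILE
# Pipes FILE through COMPRESS_CMD, then DECOMPRESS_CMD, and compares the
# result byte-for-byte with the original. Both commands are assumed to read
# stdin and write stdout; they are left unquoted so options are word-split.
check_roundtrip() {
    compress=$1
    decompress=$2
    file=$3
    if $compress < "$file" | $decompress | cmp -s - "$file"; then
        echo "ok: $file"
    else
        echo "CORRUPTED: $file"
        return 1
    fi
}
```

A call such as `check_roundtrip "./lbzip2 -z -c" "bzip2 -d -c" compressme` can then be run in a loop over several files.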
