C++ extractor giving multiple compilation errors when trying to compile the linux kernel #16908

Open
thatjiaozi opened this issue Jul 4, 2024 · 9 comments
Labels
C++ question Further information is requested

Comments

@thatjiaozi

Description of the issue

I noticed that several files of the Linux kernel source were missing when creating a database with CodeQL, using the kernel config attached to this issue and the following command:

codeql database create ~/codeql_db/linux/linux_db --language c --command "make -j`nproc`"

One of the missing files (kernel/bpf/verifier.c) is compiled correctly by the kernel build system, but the C++ extractor reports multiple compilation errors when it runs (I attached the complete log to this issue). For example:

E 15:08:35 2293019] Warning[extractor-c++]: In construct_text_message: "./arch/x86/include/asm/processor.h", line 525: error: expected a ")"

 this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0);

Please note:

  1. I have verified that the subsystem this file belongs to (CONFIG_BPF) is enabled and that the feature works when running the kernel
  2. I have attempted to create a database on multiple computers, some of them with a fresh install of Debian, and the issue always reproduces
  3. verifier.c is just an example; many other files are missing for similar reasons
  4. The code compiles just fine for the kernel
  5. I compiled an unmodified checkout of https://github.com/torvalds/linux

I believe this might be a bug in the extractor similar to #16901 and #13994. If that is so, please feel free to close this as duplicate.

Let me know if you need anything else from me to reproduce this
extractor_log.txt
config.txt

@thatjiaozi thatjiaozi added the question Further information is requested label Jul 4, 2024
@jketema
Contributor

jketema commented Jul 4, 2024

Hi,

Thanks for the report. Would you be able to check that a preprocessed copy of verifier.c has the same problem, and if so could you share the preprocessed file? That would make it significantly easier to potentially reproduce and fix this.

@thatjiaozi
Author

Here it is.

FWIW the compilation errors are actually not in verifier.c but in header files such as ./include/linux/topology.h and ./arch/x86/include/asm/tlbflush.h.

Also, please note that for this preprocessed file I had to re-attach the #include statements; otherwise the compilation process was broken for some reason...

Also, as I said, this issue seems to happen quite a bit. If I grep for the string "Extractor exiting with code 1", which seems to indicate an error, I get more than 2000 matching log files.

verifier.preprocessed.txt
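
The grep described above can be sketched as a small shell helper that lists and counts the logs recording a failed extractor run. This is a minimal sketch, not an official CodeQL tool: the function names are hypothetical, and the directory passed in (for example the log directory inside the database) is an assumption that may need adjusting.

```shell
# Sketch: find extractor logs that record a failed run.
# Assumption: the logs live somewhere under the directory given as $1
# (for example the "log" directory of the CodeQL database).

# Print the path of every file that contains the failure marker.
failed_extractor_logs() {
    grep -rlF "Extractor exiting with code 1" "$1"
}

# Print only the number of such files (0 when there are none).
count_failed_extractor_logs() {
    { failed_extractor_logs "$1" || true; } | wc -l | tr -d ' '
}
```

Calling `count_failed_extractor_logs <log-dir>` gives the number to compare before and after a workaround; `failed_extractor_logs <log-dir>` shows which translation units to inspect first.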

@jketema
Contributor

jketema commented Jul 4, 2024

> FWIW the compilation errors are actually not on verifier.c but on other header files like

That's immaterial, as the preprocessing should pull in those headers.

> Also please note that for this pre processed file I had to re attach the #include statements otherwise the compilation process was broken for some reason...

Unfortunately this means this will not be helpful for reproducing the problem.

@jketema
Contributor

jketema commented Jul 4, 2024

> Also please note that for this pre processed file I had to re attach the #include statements otherwise the compilation process was broken for some reason...

How did you preprocess the file? You will still need to pass all flags that are passed during normal compilation of the file, and not just -E. Did you pass all those flags?
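
One possible way to recover every flag used for a file is to read the command that Kbuild saved next to the object. The sketch below assumes the build tree contains a file such as kernel/bpf/.verifier.o.cmd whose first line has the form `savedcmd_<target> := <command>` (older trees use `cmd_<target>`). The function name is hypothetical, and the saved text is make-escaped (for example `$$` for `$`), so it may need light editing before it is rerun.

```shell
# Sketch: print the compiler command Kbuild saved for one object file.
# Assumption: $1 is a Kbuild ".cmd" file whose command line starts with
# "savedcmd_<target> := " (or "cmd_<target> := " on older trees).
kbuild_saved_cmd() {
    cmd_file="$1"
    # Strip the "savedcmd_... := " prefix and print only that line.
    sed -n -E 's/^(saved)?cmd_[^[:space:]]+ := //p' "$cmd_file"
}

# Example, run from the top of an already built kernel tree:
#   kbuild_saved_cmd kernel/bpf/.verifier.o.cmd
# Rerunning the printed command with "-c -o <object>" replaced by
# "-E -o preproc.i" should preprocess the file with the same flags.
```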

@thatjiaozi
Author

It was generated with the kernel build system: make kernel/bpf/verifier.i

@jketema
Contributor

jketema commented Jul 5, 2024

Would you mind re-running manually to see if that gives better results? Assuming you've done a complete build already, go to the directory that has the subdirectory kernel/bpf with verifier.c, and run something along the lines of:

/usr/libexec/gcc/x86_64-linux-gnu/13/cc1 -quiet -nostdinc -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi -imultiarch x86_64-linux-gnu -D __KERNEL__ -D 'KBUILD_MODFILE="kernel/bpf/verifier"' -D 'KBUILD_BASENAME="verifier"' -D 'KBUILD_MODNAME="verifier"' -D __KBUILD_MODNAME=kmod_verifier -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -MMD kernel/bpf/.verifier.o.d kernel/bpf/verifier.c -quiet -dumpdir kernel/bpf/ -dumpbase verifier.c -dumpbase-ext .c -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -mindirect-branch=thunk-extern -mindirect-branch-register -mindirect-branch-cs-prefix -mfunction-return=thunk-extern -march=x86-64 -g -gdwarf-4 -O2 -Wall -Wundef -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Werror=strict-prototypes -Wno-format-security -Wno-trigraphs -Wno-frame-address -Wno-address-of-packed-member -Wmissing-declarations -Wmissing-prototypes -Wframe-larger-than=2048 -Wno-main -Wdangling-pointer=0 -Wvla -Wno-pointer-sign -Wcast-function-type -Wstringop-overflow=0 -Warray-bounds=0 -Walloc-size-larger-than=18446744073709551615EiB -Wimplicit-fallthrough=5 -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wenum-conversion -Wextra -Wunused -Wno-unused-but-set-variable -Wunused-const-variable=0 -Wno-packed-not-aligned -Wformat-overflow=0 -Wformat-truncation=0 -Wno-stringop-truncation -Wno-override-init -Wno-missing-field-initializers -Wno-type-limits -Wno-shift-negative-value -Wno-maybe-uninitialized -Wno-sign-compare -Wno-unused-parameter -std=gnu11 -fmacro-prefix-map=./= -fshort-wchar -funsigned-char -fno-common -fno-PIE -fno-strict-aliasing -fcf-protection=branch -falign-jumps=1 -falign-loops=1 -fno-asynchronous-unwind-tables -fno-jump-tables -fpatchable-function-entry=16,16 -fno-delete-null-pointer-checks -fno-allow-store-data-races -fstack-protector -fomit-frame-pointer -fno-stack-clash-protection -falign-functions=16 -fstrict-flex-arrays=3 -fno-strict-overflow -fstack-check=no -fconserve-stack -fsanitize-coverage=trace-pc -E -o preproc.i

@jketema jketema added the C++ label Jul 5, 2024
@jketema
Contributor

jketema commented Jul 8, 2024

The problem is caused by:

https://github.com/torvalds/linux/blob/256abd8e550ce977b728be79a74e1729438b4948/arch/x86/include/asm/percpu.h#L44-L48

__seg_gs and __seg_fs are currently not supported by our C/C++ frontend. As a workaround while building a database, you can temporarily change the code to:

#ifdef CONFIG_X86_64
#define __percpu_seg_override	
#else
#define __percpu_seg_override	
#endif

I've reported the problem to our frontend provider.
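
The temporary edit can also be scripted so that it is easy to undo once the database has been built. This is a rough sketch under stated assumptions: the function names are hypothetical, and it assumes the header defines the macro on lines of the form `#define __percpu_seg_override __seg_gs` (or `__seg_fs`), separated by tabs or spaces.

```shell
# Sketch: blank out the __seg_gs/__seg_fs definitions in a header and
# keep the untouched file next to it as "<header>.orig".
# Assumption: the macro is defined on a single line ending in
# __seg_gs or __seg_fs, as in the workaround above.
blank_percpu_seg_override() {
    header="$1"
    sed -E -i.orig \
        's/^(#define[[:space:]]+__percpu_seg_override)[[:space:]]+__seg_(gs|fs)[[:space:]]*$/\1/' \
        "$header"
}

# Put the original header back after the database has been created.
restore_percpu_seg_override() {
    mv "$1.orig" "$1"
}

# Example, run from the top of the kernel tree:
#   blank_percpu_seg_override arch/x86/include/asm/percpu.h
#   ... run "codeql database create" as before ...
#   restore_percpu_seg_override arch/x86/include/asm/percpu.h
```

Keeping a backup rather than editing by hand makes it harder to leave the modified header in the tree by accident.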

@eldstal

eldstal commented Jul 21, 2024

I can confirm, the suggested workaround works wonders; CodeQL can now analyze the linux kernel.

Thanks for helping out!

@thatjiaozi
Author

I also confirm that this seems to fix most of the issues :) There are a few compilation errors left but they seem to be unrelated. For now most of the kernel seems indexed on my end!
