Reproducible segmentation fault during build on linux/arm64 #228
Comments
This is the C code where it crashes:
size_t world = jl_atomic_load_acquire(&jl_world_counter);
ct->world_age = world;
if (!has_defs && jl_get_module_infer(m) != 0) {
    (void)jl_type_infer(mfunc, world, 0);
}
result = jl_invoke(/*func*/NULL, /*args*/NULL, /*nargs*/0, mfunc); // crashes
ct->world_age = last_age;
https://github.com/JuliaLang/julia/blob/36034abf26062acad4af9dcec7c4fc53b260dbb4/src/toplevel.c#L897
The last PR to change the line where it segfaults was JuliaLang/julia#31984. @vtjnash @JeffBezanson any advice for how I could debug this? Or is this line unrelated?
We are trying to call into the JIT there, so perhaps LLVM is computing the jump address incorrectly? The stack trace is not precise enough to show which value it crashed on. LLVM is planning some fixes for that for AArch64 in JITLink in the upcoming release, though.
Thanks. Should I raise an issue on the main Julia repo or LLVM? Here's a minimal Dockerfile which gives the same error:
FROM julia:1.8.2
RUN julia -e 'using Pkg; Pkg.add("Conda"); Pkg.build("Conda")'
Another interesting clue is that I can actually build this just fine on my ARM-based laptop (M1). It's only when I try to build the
To reproduce this with GitHub actions, you could either build this locally on an x86_64 system, using
Alternatively, you can create a GitHub action. First, create a Dockerfile in the root directory containing the above. Then, create a workflow file:
name: Docker test
on:
  push:
    branches:
      - "**"
jobs:
  docker:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        arch: [linux/amd64, linux/arm64]
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          platforms: ${{ matrix.arch }}
          push: false
(This could be combined with https://github.com/csexton/debugger-action to interact with it after failure.)
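For anyone who wants to try the same two-platform matrix outside of GitHub Actions, here is a minimal sketch in Python. It assumes Docker with the buildx plugin is installed and that QEMU emulation for arm64 is already registered on the host; neither is stated in this thread. `build_command` and `run_builds` are hypothetical helper names, not part of any tool mentioned here.

```python
import subprocess

# Same platforms as the `arch` matrix in the workflow file.
PLATFORMS = ["linux/amd64", "linux/arm64"]


def build_command(platform, context="."):
    """Return the `docker buildx build` argument list for one target platform."""
    return ["docker", "buildx", "build", "--platform", platform, context]


def run_builds(platforms=PLATFORMS, context="."):
    """Build the Dockerfile in `context` once per platform.

    Returns a dict mapping each platform to the exit code of its build.
    Assumes `docker` is on PATH (an assumption, see the note above).
    """
    results = {}
    for platform in platforms:
        cmd = build_command(platform, context)
        print("+", " ".join(cmd))
        results[platform] = subprocess.run(cmd).returncode
    return results
```

Calling `run_builds()` from the directory that holds the Dockerfile should show whether the arm64 build fails while the amd64 build passes, which is the pattern reported in this issue. Nothing is pushed, matching `push: false` in the workflow.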
The equivalent issue for M1 was fixed for arm64-darwin in the previous (old) release of LLVM, so that would make sense. You would likely need to get a version of LLVM master working with Julia master before reporting it.
I experience the same segfault when simply precompiling the
@vtjnash can you point to further issues which could help solve this?
I'm trying to build docker images for PySR (which is built on PyJulia), and the arm64 jobs fail consistently because of a segmentation fault when building Conda.jl. The amd64 jobs are fine.
Here's the traceback:
Here's the job result, the dockerfile, and the action file. This same error occurs every time I run the job.
ubuntu-latest
python:latest (platform=linux/arm64)
The line it's getting a segfault on in build.jl:
Conda.jl/deps/build.jl, line 106 in 8f71332
Any idea what this is? @mkitti would you happen to know?