Skip to content

Conversation

benlorenz
Copy link
Contributor

Since 1.11.5 we are getting crashes from OpenMP threads of an external library which uses GMP numbers. This did not happen on 1.11.4. Version 1.12 and nightly are also unaffected.

Attaching gdb shows that the task struct is an invalid pointer (but not NULL):

(gdb) p ct
$1 = (jl_task_t *) 0xffffffffffffff90

This value seems to come from the jl_current_task macro

#define jl_current_task (container_of(jl_get_pgcstack(), jl_task_t, gcstack))

which doesn't check whether jl_get_pgcstack() returns NULL for foreign threads. Two lines further down is a check for ct != NULL but this cannot happen since container_of will just subtract a fixed offset from a NULL-pointer.

julia/src/gc.c

Lines 4122 to 4124 in 2d89891

jl_task_t *ct = jl_current_task;
void *data = malloc(sz);
if (data != NULL && ct != NULL && ct->world_age) {

I changed the relevant two lines to use jl_get_current_task() which will properly return NULL if jl_get_pgcstack() returns NULL. This also aligns it with the current master:

julia/src/gc-stock.c

Lines 3741 to 3742 in 16eca6e

jl_task_t *ct = jl_get_current_task();
if (data != NULL && ct != NULL) {

This was probably introduced during the backport in #57880, @gbaraldi.

Note that this PR targets release-1.11, not sure if this is correct but there is no backports-release-1.11 branch right now.

cc: @lgoettgens
x-ref: oscar-system/Oscar.jl#4806

@lgoettgens
Copy link
Contributor

Note that this PR targets release-1.11, not sure if this is correct but there is no backports-release-1.11 branch right now.

Maybe @KristofferC can comment on this

@gbaraldi
Copy link
Member

LGTM branch target aside

@fingolfin fingolfin added the backport 1.11 Change should be backported to release-1.11 label Apr 24, 2025
@fingolfin
Copy link
Member

There doesn't seem to be a backports 1.11.6 branch at this point. I realize 1.12 is coming closer, but it would be really unfortunate to leave 1.11.5 as last 1.11.x release with such a severe regression.

@KristofferC
Copy link
Member

I'll make one soon

@KristofferC KristofferC mentioned this pull request Apr 25, 2025
71 tasks
@KristofferC KristofferC changed the base branch from release-1.11 to backports-release-1.11 April 25, 2025 09:36
@KristofferC KristofferC added the merge me PR is reviewed. Merge when all tests are passing label Apr 25, 2025
@IanButterworth IanButterworth merged commit 3997241 into JuliaLang:backports-release-1.11 Apr 28, 2025
9 checks passed
@benlorenz benlorenz deleted the bl/rel111_mallocnull branch April 28, 2025 22:00
@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label Apr 28, 2025
@KristofferC KristofferC mentioned this pull request Aug 19, 2025
65 tasks
DilumAluthge added a commit that referenced this pull request Sep 5, 2025
Backported PRs:
- [x] #54840 <!-- Add boundscheck in speccache_eq to avoid OOB access
due to data race -->
- [x] #42080 <!-- recommend explicit `using Foo: Foo, ...` in package
code (was: "using considered harmful") -->
- [x] #58127 <!-- [DOC] Update installation docs: /downloads/ =>
/install/ -->
- [x] #58202 <!-- [release-1.11] malloc: use jl_get_current_task to fix
null check -->
- [x] #58584 <!-- Make `Ptr` values static-show w/ type-information -->
- [x] #58637 <!-- Make late gc lower handle insertelement of alloca use.
-->
- [x] #58837 <!-- fix null comparisons for non-standard address spaces
-->
- [x] #57826 <!-- Add a `similar` method for `Type{<:CodeUnits}` -->
- [x] #58293 <!-- fix trailing indices stackoverflow in reinterpreted
array -->
- [x] #58887 <!-- Pkg: Allow configuring can_fancyprint(io::IO) using
IOContext -->
- [x] #58937 <!-- Fix nthreadpools size in JLOptions -->
- [x] #58978 <!-- Fix precompilepkgs warn loaded setting -->
- [x] #58998 <!-- Bugfix: Use Base.aligned_sizeof instead of sizeof in
Mmap.mmap -->
- [x] #59120 <!-- Fix memory order typo in "src/julia_atomics.h" -->
- [x] #59170 <!-- Clarify and enhance confusing precompile test -->

Need manual backport:
- [ ] #56329 <!-- loading: clean up more concurrency issues -->
- [ ] #56956 <!-- Add "mea culpa" to foreign module assignment error.
-->
- [ ] #57035 <!-- linux: workaround to avoid deadlock inside
dl_iterate_phdr in glibc -->
- [ ] #57089 <!-- Block thread from receiving profile signal with
stackwalk lock -->
- [ ] #57249 <!-- restore non-freebsd-unix fix for profiling -->
- [ ] #58011 <!-- Remove try-finally scope from `@time_imports`
`@trace_compile` `@trace_dispatch` -->
- [ ] #58062 <!-- remove unnecessary edge from `exp_impl` to `pow` -->
- [ ] #58157 <!-- add showing a string to REPL precompile workload -->
- [ ] #58209 <!-- Specialize `one` for the `SizedArray` test helper -->
- [ ] #58108 <!-- Base.get_extension & Dates.format made public -->
- [ ] #58356 <!-- codegen: remove readonly from abstract type calling
convention -->
- [ ] #58415 <!-- [REPL] more reliable extension loading -->
- [ ] #58510 <!-- Don't filter `Core` methods from newly-inferred list
-->
- [ ] #58110 <!-- relax dispatch for the `IteratorSize` method for
`Generator` -->
- [ ] #58965 <!-- Fix `hygienic-scope`s in inner macro expansions -->
- [ ] #58971 <!-- Fix alignment of failed precompile jobs on CI -->
- [ ] #59066 <!-- build: Also pass -fno-strict-aliasing for C++ -->

Contains multiple commits, manual intervention needed:
- [ ] #55877 <!-- fix FileWatching designs and add workaround for a stat
bug on Apple -->
- [ ] #56755 <!-- docs: fix scope type of a `struct` to hard -->
- [ ] #57809 <!-- Fix fptrunc Float64 -> Float16 rounding through
Float32 -->
- [ ] #57398 <!-- Make remaining float intrinsics require float
arguments -->
- [ ] #56351 <!-- Fix `--project=@script` when outside script directory
-->
- [ ] #57129 <!-- clarify that time_ns is monotonic -->
- [ ] #58134 <!-- Note annotated string API is experimental in Julia
1.11 in HISTORY.md -->
- [ ] #58401 <!-- check that hashing of types does not foreigncall
(`jl_type_hash` is concrete evaluated) -->
- [ ] #58435 <!-- Fix layout flags for types that have oddly sized
primitive type fields -->
- [ ] #58483 <!-- Fix tbaa usage when storing into heap allocated
immutable structs -->
- [ ] #58512 <!-- Make more types jl_static_show readably -->
- [ ] #58012 <!-- Re-enable tab completion of kwargs for large method
tables -->
- [ ] #58683 <!-- Add 0 predecessor to entry basic block and handle it
in inlining -->
- [ ] #59112 <!-- Add builtin function name to add methods error -->

Non-merged PRs with backport label:
- [ ] #59329 <!-- aotcompile: destroy LLVM context after serializing
combined module -->
- [ ] #58848 <!-- Set array size only when safe to do so -->
- [ ] #58535 <!-- gf.c: include const-return methods in
`--trace-compile` -->
- [ ] #58038 <!-- strings/cstring: `transcode`: prevent Windows sysimage
invalidation -->
- [ ] #57604 <!-- `@nospecialize` for `string_index_err` -->
- [ ] #57366 <!-- Use ptrdiff_t sized offsets for gvars_offsets to allow
large sysimages -->
- [ ] #56890 <!-- Enable getting non-boxed LLVM type from Julia Type -->
- [ ] #56823 <!-- Make version of opaque closure constructor in world
-->
- [ ] #55958 <!-- also redirect JL_STDERR etc. when redirecting to
devnull -->
- [ ] #55956 <!-- Make threadcall gc safe -->
- [ ] #55534 <!-- Set stdlib sources as read-only during installation
-->
- [ ] #55499 <!-- propagate the terminal's `displaysize` to the
`IOContext` used by the REPL -->
- [ ] #55458 <!-- Allow for generically extracting unannotated string
-->
- [ ] #55457 <!-- Make AnnotateChar equality consider annotations -->
- [ ] #55220 <!-- `isfile_casesensitive` fixes on Windows -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] #50813 <!-- More doctests for Sockets and capitalization fix -->
- [ ] #50157 <!-- improve docs for `@inbounds` and
`Base.@propagate_inbounds` -->

---------

Co-authored-by: Kiran Pamnany <kpamnany@users.noreply.github.com>
Co-authored-by: adienes <51664769+adienes@users.noreply.github.com>
Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>
Co-authored-by: Keno Fischer <keno@juliacomputing.com>
Co-authored-by: Simeon David Schaub <simeon@schaub.rocks>
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
Co-authored-by: Alex Arslan <ararslan@comcast.net>
Co-authored-by: Fons van der Plas <fonsvdplas@gmail.com>
Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>
Co-authored-by: JonasIsensee <jonas.isensee@web.de>
Co-authored-by: Curtis Vogt <curtis.vogt@gmail.com>
Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
Co-authored-by: DilumAluthgeBot <43731525+DilumAluthgeBot@users.noreply.github.com>
Co-authored-by: DilumAluthge <5619885+DilumAluthge@users.noreply.github.com>
@DilumAluthge DilumAluthge mentioned this pull request Sep 9, 2025
59 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.11 Change should be backported to release-1.11
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants