Garbage collection thread safety issues on 1.11 #56871

Closed
@MilesCranmer

Description

I think the garbage collection might have some thread safety issues on 1.11?

After running into various GC issues with SymbolicRegression.jl and PySR (reported in #56735 and #56759), I've been trying to break the GC with more minimal examples in the hope of generating fixes.
Here is one such example I found that results in the GC completely freezing.

The idea is to spawn multiple tasks that allocate and occasionally trigger GC.
This pushes the GC into running sweeps concurrently and potentially reveals data races in the parts of the code that modify the allocation map or page metadata.

@sync for t in 1:100
    Threads.@spawn begin
        # Each thread/task does a bunch of allocations
        for i in 1:10000
            # Allocate arrays
            A = Vector{Any}(undef, 1000)
            # Occasionally force a GC collection
            if (i % 1000) == 0
                GC.gc()
            end
        end
    end
end

If you run this loop ~2-3 times in a REPL, you should hit a freeze.
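
To make the "run it a few times" step easier to repeat, the loop above could be wrapped in a function along these lines. This is only a sketch: the names `gc_stress` and `repeat_gc_stress` and the keyword arguments are hypothetical, and the defaults simply copy the numbers from the snippet above. The atomic counter is an addition for reporting how many tasks finished; it is not part of the original reproducer.

```julia
# Hypothetical wrapper around the reproducer above; names and keywords are illustrative.
function gc_stress(; ntasks::Int = 100, niters::Int = 10_000, gc_every::Int = 1_000)
    completed = Threads.Atomic{Int}(0)   # counts tasks that ran to completion
    @sync for _ in 1:ntasks
        Threads.@spawn begin
            for i in 1:niters
                # Allocate an array, as in the original loop
                A = Vector{Any}(undef, 1000)
                # Occasionally force a full collection
                if i % gc_every == 0
                    GC.gc()
                end
            end
            Threads.atomic_add!(completed, 1)
        end
    end
    return completed[]
end

# Run the stress loop several rounds in a row and report progress after each one.
function repeat_gc_stress(rounds::Int = 3; kwargs...)
    for r in 1:rounds
        done = gc_stress(; kwargs...)
        println("round ", r, ": ", done, " tasks finished on ",
                Threads.nthreads(), " threads")
    end
    return rounds
end
```

Calling `repeat_gc_stress(3)` in a session started with several threads then corresponds to running the loop three times by hand. On an affected build the call would presumably hang partway through rather than return.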

I wonder if this comes from the alloc map and page metadata being accessed by multiple GC threads (??). Or maybe it's just from the malloc memory leak identified in the other thread.
I didn't see obvious locking here, so I think there might be data races? If multiple collector threads modify metadata concurrently (like changing page states), this may cause corruption.

Now, the good news is that this seems to be fixed on nightly. I'm not sure what caused the issue; maybe someone can point me to an issue I missed that has since been fixed. (It doesn't seem to be fixed yet on the release-1.11 branch, though.)

In any case, it might be useful to have some of these tests in the CI, so such issues will not show up again?
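
As a rough idea of what such a CI check could look like: since a GC freeze stalls the whole process, a watchdog inside the same process may never get to run, so this sketch launches the reproducer in a child Julia process and kills it after a timeout. Everything here is an assumption for illustration, not how Julia's CI is actually set up: the names `run_with_timeout`, `check_gc_stress`, and `REPRO`, the thread count of 4, and the 300-second limit are all hypothetical.

```julia
# Sketch of a hang detector: run a command in a child process and give up after a timeout.
# Returns :ok, :failed (non-zero exit), or :timed_out (the child was killed).
function run_with_timeout(cmd::Cmd, timeout_s::Real)
    proc = run(pipeline(ignorestatus(cmd); stdout = devnull, stderr = devnull); wait = false)
    status = timedwait(() -> process_exited(proc), Float64(timeout_s); pollint = 0.5)
    if status === :ok
        return success(proc) ? :ok : :failed
    end
    # The child did not exit in time; a process stuck in the GC may ignore SIGTERM.
    kill(proc, Base.SIGKILL)
    wait(proc)
    return :timed_out
end

# The reproducer from this issue, repeated three times, as a script for the child process.
const REPRO = """
for round in 1:3
    @sync for t in 1:100
        Threads.@spawn begin
            for i in 1:10000
                A = Vector{Any}(undef, 1000)
                if (i % 1000) == 0
                    GC.gc()
                end
            end
        end
    end
end
"""

# Hypothetical entry point; the thread count and timeout are assumed values.
function check_gc_stress(; threads::Int = 4, timeout_s::Real = 300)
    cmd = `$(Base.julia_cmd()) --threads=$threads -e $REPRO`
    return run_with_timeout(cmd, timeout_s)
end
```

A test could then assert that `check_gc_stress() === :ok`, so that a freeze shows up as a `:timed_out` failure instead of a stuck CI job.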

Metadata

Labels

GC (Garbage collector), multithreading (Base.Threads and related functionality), regression 1.11 (Regression in the 1.11 release)
