I have an MWE that segfaults (frequently, but not deterministically) when I run a script that uses multi-threading on the Julia side.
Before launching ipython3, I ran:
export JULIA_NUM_THREADS=4
(my computer has 4 cores and 8 threads).
The MWE is:
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.31.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from juliacall import Main as jl
In [2]: import numpy as np
In [3]: jl.seval("""
   ...: function test(x)
   ...:     partial = zeros(Threads.nthreads())
   ...:     Threads.@threads for i in 1:Threads.nthreads()
   ...:         for j in i:Threads.nthreads():length(x)
   ...:             partial[i] += x[j]
   ...:         end
   ...:     end
   ...:     return sum(partial)
   ...: end
   ...: """)
Out[3]: test (generic function with 1 method)
In [4]: x = np.random.random((10_000,))
In [5]: %timeit jl.test(x)
72.2 µs ± 35.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [6]: %timeit jl.test(x)
Segmentation fault (core dumped)
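Because the crash is not deterministic, it may help to measure how often it happens outside of IPython. The following is only a sketch under stated assumptions: `run_repeatedly`, `is_segfault`, and `REPRO` are hypothetical names introduced here, the loop count of 1000 calls is arbitrary, and the check on the return code assumes a POSIX system, where a child killed by a signal reports a negative return code. The Julia function is the same `test` from the MWE above.

```python
import os
import signal
import subprocess
import sys

# The MWE above written as a standalone script (hypothetical wrapper;
# the loop count of 1000 is an arbitrary choice, not from the report).
REPRO = '''
from juliacall import Main as jl
import numpy as np

jl.seval("""
function test(x)
    partial = zeros(Threads.nthreads())
    Threads.@threads for i in 1:Threads.nthreads()
        for j in i:Threads.nthreads():length(x)
            partial[i] += x[j]
        end
    end
    return sum(partial)
end
""")

x = np.random.random((10_000,))
for _ in range(1000):
    jl.test(x)
'''


def run_repeatedly(code, runs=5, threads=4):
    """Run `code` in `runs` fresh Python processes and return their return codes."""
    # Set JULIA_NUM_THREADS for the child only, mirroring the `export` above.
    env = dict(os.environ, JULIA_NUM_THREADS=str(threads))
    codes = []
    for _ in range(runs):
        proc = subprocess.run(
            [sys.executable, "-c", code],
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes.append(proc.returncode)
    return codes


def is_segfault(returncode):
    """True if the child was killed by SIGSEGV (POSIX reports this as -SIGSEGV)."""
    return returncode == -int(signal.SIGSEGV)


# Example usage (requires juliacall and numpy to be installed):
#   codes = run_repeatedly(REPRO, runs=20, threads=4)
#   print(sum(is_segfault(c) for c in codes), "segfaults out of", len(codes))
```

Running the same harness with `threads=1` would be one way to check whether the crashes really depend on multi-threading.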
Here I have reproduced the error using IPython's %timeit magic, but the actual error occurs after several calls to a function from my package:
In [1]: from juliacall import Main as jl
In [2]: jl.seval("using CellListMap")
In [3]: import numpy as np
In [5]: x = np.random.random((10_000,3))
In [6]: nb = jl.neighborlist(x.transpose(), 0.05)
In [7]: nb = jl.neighborlist(x.transpose(), 0.05)
In [8]: nb = jl.neighborlist(x.transpose(), 0.05)
In [9]: nb = jl.neighborlist(x.transpose(), 0.05)
In [10]: nb = jl.neighborlist(x.transpose(), 0.05)
In [11]: nb = jl.neighborlist(x.transpose(), 0.05)
In [12]: nb = jl.neighborlist(x.transpose(), 0.05)
In [13]: nb = jl.neighborlist(x.transpose(), 0.05)
In [14]: nb = jl.neighborlist(x.transpose(), 0.05)
In [15]: nb = jl.neighborlist(x.transpose(), 0.05)
In [16]: nb = jl.neighborlist(x.transpose(), 0.05)
In [17]: nb = jl.neighborlist(x.transpose(), 0.05)
In [18]: nb = jl.neighborlist(x.transpose(), 0.05)
In [19]: nb = jl.neighborlist(x.transpose(), 0.05)
Segmentation fault (core dumped)
Since %timeit also runs the function multiple times, repeated calls seem to trigger some memory corruption or memory overflow that causes the error.
Any hint on how to debug this would be much appreciated.
(Even in the simplest example above, the segfaults only occur with multi-threading.)
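One possible first debugging step is to enable Python's standard `faulthandler` module before importing juliacall, so that a Python-level traceback of all threads is written when the process receives SIGSEGV. This is only a sketch: `enable_crash_log` and the log file name are hypothetical, and the traceback shows Python frames only, not the Julia stack, so it may narrow down which call crashes without explaining why.

```python
import faulthandler


def enable_crash_log(path="segfault_traceback.log"):
    """Dump Python tracebacks of all threads to `path` on fatal signals.

    The returned file object must stay open for as long as the handler
    is enabled, so the caller should keep a reference to it.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Example usage, placed at the top of the session or script:
#   crash_log = enable_crash_log()
#   from juliacall import Main as jl
#   ...
```

Writing to a dedicated file rather than the terminal keeps the traceback even if the terminal output is lost when the process dies.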