segmentation fault with multi-threading #219

Open
@lmiq

Description

With the MWE below I get segmentation faults (frequently, but not deterministically) when running a script that uses multi-threading on the Julia side.

Before launching ipython3, I set:

export JULIA_NUM_THREADS=4

(My computer has 4 cores and 8 threads.)

The MWE is:

Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.31.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from juliacall import Main as jl

In [2]: import numpy as np

In [3]: jl.seval("""
   ...: function test(x)
   ...:     partial = zeros(Threads.nthreads())
   ...:     Threads.@threads for i in 1:Threads.nthreads()
   ...:         for j in i:Threads.nthreads():length(x)
   ...:             partial[i] += x[j]
   ...:         end
   ...:     end
   ...:     return sum(partial)
   ...: end
   ...: """)
Out[3]: test (generic function with 1 method)

In [4]: x = np.random.random((10_000,))

In [5]: %timeit jl.test(x)
72.2 µs ± 35.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit jl.test(x)
Segmentation fault (core dumped)

In the session above I triggered the error with IPython's %timeit magic, but the actual error appears after several calls to a function from my package:

In [1]: from juliacall import Main as jl

In [2]: jl.seval("using CellListMap")

In [3]: import numpy as np

In [5]: x = np.random.random((10_000,3))

In [6]: nb = jl.neighborlist(x.transpose(), 0.05)

In [7]: nb = jl.neighborlist(x.transpose(), 0.05)

In [8]: nb = jl.neighborlist(x.transpose(), 0.05)

In [9]: nb = jl.neighborlist(x.transpose(), 0.05)

In [10]: nb = jl.neighborlist(x.transpose(), 0.05)

In [11]: nb = jl.neighborlist(x.transpose(), 0.05)

In [12]: nb = jl.neighborlist(x.transpose(), 0.05)

In [13]: nb = jl.neighborlist(x.transpose(), 0.05)

In [14]: nb = jl.neighborlist(x.transpose(), 0.05)

In [15]: nb = jl.neighborlist(x.transpose(), 0.05)

In [16]: nb = jl.neighborlist(x.transpose(), 0.05)

In [17]: nb = jl.neighborlist(x.transpose(), 0.05)

In [18]: nb = jl.neighborlist(x.transpose(), 0.05)

In [19]: nb = jl.neighborlist(x.transpose(), 0.05)
Segmentation fault (core dumped)

Since %timeit runs the function many times, the crash seems to come from repeated calls, possibly through some memory corruption or memory overflow.

In any case, I would be very thankful even for a hint on how to debug this.

(Even in the simplest example above, the segfaults occur only with multi-threading.)
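
For reference, here is a single-threaded Python sketch of what the Julia `test` function in the first example computes. It can be used to check the value returned by `jl.test(x)` on runs that do not crash. The helper name `strided_partial_sum` and the default of 4 chunks (chosen to match `JULIA_NUM_THREADS=4`) are assumptions made for illustration. This sketch does not use juliacall and does not reproduce the segfault.

```python
import math
import random


def strided_partial_sum(x, nchunks=4):
    """Single-threaded reference for the Julia `test` function.

    Chunk i (0-based here, 1-based in Julia) accumulates
    x[i], x[i + nchunks], x[i + 2*nchunks], ...
    mirroring `for j in i:Threads.nthreads():length(x)`.
    `nchunks` is a hypothetical stand-in for Threads.nthreads().
    """
    partial = [0.0] * nchunks
    for i in range(nchunks):
        # Each chunk writes only to its own slot of `partial`.
        for j in range(i, len(x), nchunks):
            partial[i] += x[j]
    return sum(partial)


if __name__ == "__main__":
    random.seed(0)
    x = [random.random() for _ in range(10_000)]
    # Every element is visited exactly once, so the result should agree
    # with a plain sum up to floating-point rounding.
    print(math.isclose(strided_partial_sum(x), math.fsum(x), rel_tol=1e-9))
```

Because each chunk index writes to a separate slot of `partial`, the loop itself has no overlapping writes, so the function is expected to behave like an ordinary sum of `x`.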
