-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
Closed
Description
Starting at abeecee onwards, after a successful build, make testall gets stuck at test set Profile. This test set has run for up to an hour without completing. A Ctrl-C interrupt exits from make testall. If the Profile test set is run by itself, make test-Profile, the console become unresponsive, even to Ctrl-C, requiring a kill <pid> from a separate terminal. The issue was bisected from a known good make testall at 3db036e. The first bad commit and test failure was at abeecee. The latest HEAD, 219dc10, continues to have this issue.
test summary
julia> versioninfo()
Julia Version 1.10.0-DEV.1147
Commit abeecee71c (2023-04-28 20:59 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.6.0)
CPU: 10 × Apple M1 Max
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
Threads: 5 on 8 virtual cores
Environment:
JULIA_NUM_THREADS = 4
JULIA_DEPOT_PATH = /Users/foo/applications/juliabuild
% gmake testall 19:25
Whitespace check found no issues.
cp /Users/foo/applications/julia/usr/lib/julia/sys.dylib /Users/foo/applications/julia/local.dylib
/Users/foo/applications/julia/usr/bin/julia -J /Users/foo/applications/julia/local.dylib -e 'true'
rm /Users/foo/applications/julia/local.dylib
JULIA test/all
Running parallel tests with:
nworkers() = 8
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 41.985 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
.
.
.
Profile (4) | started at 2023-04-28T19:36:28.703
.
.
.
Statistics | 810 810 30.2s
FileWatching | 562 562 1m46.3s
Pkg | 3232 1 3233 12m59.7s
Profile | 1 1
precompile | 1 1
SharedArrays | 1 1
threads | 1 1
Distributed | 1 1
stress | 1 1
FAILURE
The global RNG seed was 0x7510a9e4d9ccb5a01256c78602917ca8.
Error in testset Profile:
Interrupted
Error in testset precompile:
Interrupted
Error in testset SharedArrays:
Interrupted
Error in testset threads:
Interrupted
Error in testset Distributed:
Interrupted
Error in testset stress:
Interrupted
ERROR: LoadError: Test run finished with errors
in expression starting at /Users/foo/applications/julia/test/runtests.jl:93
┌ Warning: Forcibly interrupting busy workers
│ exception = rmprocs: pids [4] not terminated after 5.0 seconds.
└ @ Distributed ~/applications/julia/usr/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1253
┌ Warning: rmprocs: process 1 not removed
└ @ Distributed ~/applications/julia/usr/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1049
gmake[1]: *** [Makefile:25: all] Error 1
gmake: *** [Makefile:611: testall] Error 2
% gmake test-Profile 20:35
Whitespace check found no issues.
JULIA test/Profile
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 40.115 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
Profile (1) | started at 2023-04-28T20:40:16.513
gmake: *** [Makefile:618: test-Profile] Killed: 9
% gmake test-precompile 20:57
Whitespace check found no issues.
JULIA test/precompile
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 39.769 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
precompile (1) | started at 2023-04-28T20:58:16.664
precompile (1) | 28.84 | 0.06 | 0.2 | 679.60 | 425.94
Test Summary: | Pass Total Time
Overall | 239 239 0.4s
SUCCESS
% gmake test-SharedArrays 20:58
Whitespace check found no issues.
JULIA test/SharedArrays
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 39.765 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
SharedArrays(1) | started at 2023-04-28T20:59:27.009
SharedArrays(1) | 12.06 | 0.26 | 2.2 | 1240.84 | 443.19
Test Summary: | Pass Total Time
Overall | 114 114 0.4s
SUCCESS
% gmake test-threads 20:59
Whitespace check found no issues.
JULIA test/threads
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 39.756 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
threads (1) | started at 2023-04-28T21:00:03.588
Test Summary: | Pass Total Time
threads_exec.jl with JULIA_NUM_THREADS == 1 | 40347 40347 18.3s
Test Summary: | Pass Total Time
threads_exec.jl with JULIA_NUM_THREADS == 2 | 50249 50249 14.2s
Test Summary: | Pass Total Time
threads_exec.jl with JULIA_NUM_THREADS == 4 | 50290 50290 14.0s
Test Summary: | Pass Total Time
threads_exec.jl with JULIA_NUM_THREADS == 4 | 49700 49700 14.0s
threads (1) | 100.38 | 0.00 | 0.0 | 31.94 | 310.11
Test Summary: | Pass Total Time
Overall | 40 40 0.4s
SUCCESS
% gmake test-Distributed 21:01
Whitespace check found no issues.
JULIA test/Distributed
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 39.577 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
Distributed(1) | started at 2023-04-28T21:13:24.643
┌ Warning: Skipping a thread-unsafe test because `Threads.maxthreadid() > 1`
│ Threads.maxthreadid() = 5
└ @ Main ~/applications/julia/usr/share/julia/stdlib/v1.10/Distributed/test/distributed_exec.jl:161
┌ Warning: Skipping a thread-unsafe test because `Threads.maxthreadid() > 1`
│ Threads.maxthreadid() = 5
└ @ Main ~/applications/julia/usr/share/julia/stdlib/v1.10/Distributed/test/distributed_exec.jl:161
[ Info: SO_REUSEPORT is unsupported, skipping reuseport tests
From worker 28:
From worker 28: [69000] signal (15): Terminated: 15
From worker 28: in expression starting at none:0
From worker 28: __psynch_cvwait at /usr/lib/system/libsystem_kernel.dylib (unknown line)
From worker 28: unknown function (ip: 0x0)
From worker 28: __psynch_cvwait at /usr/lib/system/libsystem_kernel.dylib (unknown line)
From worker 28: unknown function (ip: 0x0)
From worker 28: __psynch_cvwait at /usr/lib/system/libsystem_kernel.dylib (unknown line)
From worker 28: unknown function (ip: 0x0)
From worker 28: kevent at /usr/lib/system/libsystem_kernel.dylib (unknown line)
From worker 28: unknown function (ip: 0x0)
From worker 28: Allocations: 2548794 (Pool: 2546673; Big: 2121); GC: 4
┌ Warning: rmprocs: process 1 not removed
└ @ Distributed ~/applications/julia/usr/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1049
Distributed(1) | 120.69 | 0.00 | 0.0 | 27.86 | 317.78
Test Summary: | Pass Total Time
Overall | 15 15 0.4s
SUCCESS
% gmake test-stress 21:15
Whitespace check found no issues.
JULIA test/stress
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 39.331 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
stress (1) | started at 2023-04-28T21:22:00.424
(libuv) kqueue(): Too many open files
stress (1) | 8.26 | 0.00 | 0.0 | 31.85 | 307.91
Test Summary: | Pass Total Time
Overall | 118 118 0.4s
SUCCESS
bisect summary
abeecee71c202f06c454963890e41c1f73522a3e is the first bad commit
commit abeecee71c202f06c454963890e41c1f73522a3e
Author: Diogo Netto <61364108+d-netto@users.noreply.github.com>
Date: Fri Apr 28 17:59:33 2023 -0300
Implement parallel marking (#48600)
Using a work-stealing queue after Chase and Lev, optimized for
weak memory models by Le et al.
Default number of GC threads is half the number of compute threads.
Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>
Co-authored-by: Valentin Churavy <v.churavy@gmail.com>
NEWS.md | 4 +
base/options.jl | 1 +
base/threadingconstructs.jl | 7 +
doc/man/julia.1 | 5 +
doc/src/base/multi-threading.md | 1 +
doc/src/manual/command-line-interface.md | 1 +
doc/src/manual/environment-variables.md | 8 +
doc/src/manual/multi-threading.md | 9 +
src/Makefile | 2 +-
src/gc-debug.c | 64 +++---
src/gc.c | 381 +++++++++++++++++++++++--------
src/gc.h | 37 +--
src/init.c | 1 +
src/jl_exported_data.inc | 1 +
src/jloptions.c | 11 +
src/jloptions.h | 1 +
src/julia.h | 1 +
src/julia_threads.h | 10 +-
src/options.h | 3 +
src/partr.c | 33 ++-
src/threading.c | 42 +++-
src/threading.h | 1 +
src/work-stealing-queue.h | 102 +++++++++
stdlib/Distributed/src/cluster.jl | 5 +-
test/choosetests.jl | 2 +-
test/cmdlineargs.jl | 18 ++
test/gc.jl | 18 ++
test/gc/binarytree.jl | 53 +++++
test/gc/linkedlist.jl | 21 ++
test/gc/objarray.jl | 35 +++
30 files changed, 722 insertions(+), 156 deletions(-)
create mode 100644 src/work-stealing-queue.h
create mode 100644 test/gc.jl
create mode 100644 test/gc/binarytree.jl
create mode 100644 test/gc/linkedlist.jl
create mode 100644 test/gc/objarray.jl
% gmake test-Profile 12:56
Whitespace check found no issues.
JULIA test/Profile
Running parallel tests with:
nworkers() = 1
nthreads() = 4
Sys.CPU_THREADS = 8
Sys.total_memory() = 64.000 GiB
Sys.free_memory() = 44.274 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
Profile (1) | started at 2023-04-29T12:56:52.860
gmake: *** [Makefile:618: test-Profile] Killed: 9
Metadata
Metadata
Assignees
Labels
No labels