Description
Unfortunately, no MWE yet, but I'm opening this issue early to share what I have. The basic problem is that I have a large project where, in a particular workflow, precompilation is silently cancelled about 50% of the time.
When this happens, Pkg leaves behind an incomplete progress bar + spinners:
Progress [===================> ] 134/293
◒ MLStyle
◑ Parsers
Otherwise, no error is printed to the terminal. The only other sign that something went wrong is that you'll often fall immediately into (undesirable) serial pre-compilation, etc.
Stack trace from a (slightly modified) build of 1.10:
systemerror(p::String, errno::Int32; extrainfo::Nothing) at error.jl:176,
kwcall(::@NamedTuple{extrainfo::Nothing}, ::typeof(systemerror), p::String, errno::Int32) at error.jl:176,
kwcall(::@NamedTuple{extrainfo::Nothing}, ::typeof(systemerror), p::String) at error.jl:176,
#systemerror#88 at error.jl:175 [inlined],
systemerror at error.jl:175 [inlined],
open(fname::String; lock::Bool, read::Bool, write::Nothing, create::Nothing, truncate::Nothing, append::Nothing) at iostream.jl:293,
open at iostream.jl:275 [inlined],
open(fname::String, mode::String; lock::Bool) at iostream.jl:356,
open at iostream.jl:355 [inlined],
stale_cachefile(modkey::Base.PkgId, build_id::UInt128, modpath::String, cachefile::String; ignore_loaded::Bool) at loading.jl:3008,
stale_cachefile at loading.jl:3007 [inlined],
#stale_cachefile#984 at loading.jl:3005 [inlined],
stale_cachefile at loading.jl:3004 [inlined],
isprecompiled(pkg::Base.PkgId; ignore_loaded::Bool, stale_cache::Dict{Tuple{Base.PkgId, UInt128, String, String}, Bool}, cachepaths::Vector{String}, sourcepath::String) at loading.jl:1397,
isprecompiled at loading.jl:1389 [inlined],
(::Pkg.API.var"#247#285"{Bool, Bool, Pkg.Types.Context, Vector{Task}, IOStream, Dict{Base.PkgId, String}, Dict{Base.PkgId, String}, Base.Event, Base.Event, ReentrantLock, Vector{Base.PkgId}, Vector{Base.PkgId}, Dict{Base.PkgId, String}, Vector{Base.PkgId}, Vector{Base.PkgId}, Dict{Base.PkgId, Bool}, Dict{Base.PkgId, Base.Event}, Dict{Base.PkgId, Bool}, Vector{Pkg.Types.PackageSpec}, Dict{Base.PkgId, String}, Dict{Tuple{Base.PkgId, UInt128, String, String}, Bool}, Vector{Base.PkgId}, Pkg.API.var"#color_string#258"{Bool}, Bool, Bool, Base.TTY, Base.Semaphore, Bool, String, Vector{String}, Vector{Base.PkgId}, Base.PkgId})() at API.jl:1503
This shows that the SystemError is an ENOENT from this open: https://github.com/JuliaLang/julia/blob/4954197196d657d14edd3e9c61ac101866e6fa25/base/loading.jl#L3008
I think this suggests several problems:
- Base loading should probably not have an unguarded open like this (pretty much ever, I think - open can essentially always fail...)
- Pkg.precompile does not know the difference between a user interrupt vs. a failed assertion / internal error, so it silently discards this internal error assuming it has been "interrupted"
- (The actual bug) It seems there is maybe a race condition and/or caching misbehavior causing the file not to be where it is expected
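To make the first two points concrete, here is a rough sketch of what I have in mind. It is not the actual Base or Pkg code: `try_open_cachefile`, `is_stale_sketch`, and `run_job_sketch` are hypothetical names I made up for illustration, and the `eof` check is only a placeholder for the real cache header validation. The idea is that a failed `open` is treated as "cache is stale", and that only an `InterruptException` is treated as a user interrupt while any other error gets reported.

```julia
# Hypothetical helper (not part of Base): open a cache file, returning
# `nothing` instead of throwing when the file cannot be opened.
function try_open_cachefile(path::AbstractString)
    try
        return open(path, "r")
    catch err
        # `open` throws a SystemError when the underlying call fails,
        # for example when the file has disappeared (ENOENT).
        err isa SystemError || rethrow()
        return nothing
    end
end

# Hypothetical staleness check: a cache file that cannot be opened is
# treated as stale rather than as a fatal error.
function is_stale_sketch(path::AbstractString)
    io = try_open_cachefile(path)
    io === nothing && return true
    try
        return eof(io)  # placeholder for the real header validation
    finally
        close(io)
    end
end

# Hypothetical job wrapper: only InterruptException counts as a user
# interrupt; every other error is printed and returned to the caller.
function run_job_sketch(f; io::IO = stderr)
    try
        return (:ok, f())
    catch err
        if err isa InterruptException
            return (:interrupted, nothing)
        end
        println(io, "internal error during precompilation: ",
                sprint(showerror, err))
        return (:failed, err)
    end
end
```

With something along these lines, a vanished cache file would lead to a recompile of that package, and a genuine internal error would at least show up in the terminal instead of looking like a cancelled run.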
I'm also worried this issue is quite common, but just hard to notice so that we don't get bug reports...
The 5 people on my team that work on this (or other similarly large) projects have all hit this issue. I actually hit this for 2+ months before I was informed the spinners aren't supposed to just die like that 😅