Skip to content

"stream is closed or unusable" caused by multithreading race-condition (pipe closed in another thread) #47759

Open
@NHDaly

Description

@NHDaly

We recently added a background-logging task to our server, but in some unit tests, we were seeing that very rarely the background task would fail with the following stack trace:

      nested task error: IOError: stream is closed or unusable
      Stacktrace:
        [1] check_open
          @ ./stream.jl:388 [inlined]
        [2] uv_write_async(s::Base.PipeEndpoint, p::Ptr{UInt8}, n::UInt64)
          @ Base ./stream.jl:1072
        [3] uv_write(s::Base.PipeEndpoint, p::Ptr{UInt8}, n::UInt64)
          @ Base ./stream.jl:1035
        [4] unsafe_write(s::Base.PipeEndpoint, p::Ptr{UInt8}, n::UInt64)
          @ Base ./stream.jl:1118
        [5] unsafe_write
          @ ./io.jl:683 [inlined]
        [6] write(s::Base.PipeEndpoint, a::Vector{UInt8})
          @ Base ./io.jl:706
        [7] handle_message(logger::Logging.ConsoleLogger, level::Base.CoreLogging.LogLevel, message::Any, _module::Any, group::Any, id::Any, filepath::Any, line::Any; kwargs::Base.Pairs{Symbol, V, Tuple{Vararg{Symbol, N}}, NamedTuple{names, T}} where {V, N, names, T<:Tuple{Vararg{Any, N}}})
          @ Logging /nix/store/60wrl8qkdrr2l6p9nbv8xff88a8igr0n-julia-1.8.2-patched/share/julia/stdlib/v1.8/Logging/src/ConsoleLogger.jl:178
[...]

We eventually tracked this down to a call to @suppress_err in the unit test, from Suppressors.jl, which temporarily redirects stderr to a pipe, and then puts it back and closes the pipe when the macro is done.

We believe this (rare) error is a race condition between the test's Task closing the pipe at the end of the macro, and the background logging task attempting to write to the pipe. That is, we think that one thread is redirecting stderr and closing the pipe inbetween these two calls:
checking that stderr isn't closed:

if !(isopen(stream)::Bool)
stream = stderr
end

and then writing to it here:
write(stream, take!(buf))

(Here's that whole section, for posterity):

if !(isopen(stream)::Bool)
stream = stderr
end
dsize = displaysize(stream)::Tuple{Int,Int}
nkwargs = length(kwargs)::Int
if nkwargs > hasmaxlog
valbuf = IOBuffer()
rows_per_value = max(1, dsize[1] ÷ (nkwargs + 1 - hasmaxlog))
valio = IOContext(IOContext(valbuf, stream),
:displaysize => (rows_per_value, dsize[2] - 5),
:limit => logger.show_limited)
for (key, val) in kwargs
key === :maxlog && continue
showvalue(valio, val)
vallines = split(String(take!(valbuf)), '\n')
if length(vallines) == 1
push!(msglines, (indent=2, msg=SubString("$key = $(vallines[1])")))
else
push!(msglines, (indent=2, msg=SubString("$key =")))
append!(msglines, ((indent=3, msg=line) for line in vallines))
end
end
end
# Format lines as text with appropriate indentation and with a box
# decoration on the left.
color, prefix, suffix = logger.meta_formatter(level, _module, group, id, filepath, line)::Tuple{Union{Symbol,Int},String,String}
minsuffixpad = 2
buf = IOBuffer()
iob = IOContext(buf, stream)
nonpadwidth = 2 + (isempty(prefix) || length(msglines) > 1 ? 0 : length(prefix)+1) +
msglines[end].indent + termlength(msglines[end].msg) +
(isempty(suffix) ? 0 : length(suffix)+minsuffixpad)
justify_width = min(logger.right_justify, dsize[2])
if nonpadwidth > justify_width && !isempty(suffix)
push!(msglines, (indent=0, msg=SubString("")))
minsuffixpad = 0
nonpadwidth = 2 + length(suffix)
end
for (i, (indent, msg)) in enumerate(msglines)
boxstr = length(msglines) == 1 ? "[ " :
i == 1 ? "" :
i < length(msglines) ? "" :
""
printstyled(iob, boxstr, bold=true, color=color)
if i == 1 && !isempty(prefix)
printstyled(iob, prefix, " ", bold=true, color=color)
end
print(iob, " "^indent, msg)
if i == length(msglines) && !isempty(suffix)
npad = max(0, justify_width - nonpadwidth) + minsuffixpad
print(iob, " "^npad)
printstyled(iob, suffix, color=:light_black)
end
println(iob)
end
write(stream, take!(buf))

We're not sure what the right solution here should be.

It's possible that throwing an exception is already the right behavior: in our case, the exception stacktrace helped lead us to the issue!

On the other hand, philosophically, it seems that Julia takes the approach that an error during logging shouldn't throw an exception, so that you don't have to fear adding log messages. Additionally, it seems that after #40423, the logger is taking pains to gracefully fall back to stderr if the logger's stream has been closed. In this case, the stream was closed after we checked (in a data race), so it's possible we can be more robust in our stream-handling if this safety really is our aim.

For example, we think that this could be fixed by adding a try/catch around the write, and falling back to stderr.


Separately, we are going to file/fix the race condition in Suppressors.jl, probably by adding a delay in between redirecting stderr back to the original stderr, and closing the pipe.

CC: @dewilson, @vilterp

Metadata

Metadata

Assignees

No one assigned

    Labels

    loggingThe logging framework

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions