-
-
Couldn't load subscription status.
- Fork 5.7k
Description
In df39cee (#51764), the definition of pointer was changed from
pointer(x::AbstractArray{T}) where {T} = unsafe_convert(Ptr{T}, x)to the new
pointer(x::AbstractArray{T}) where {T} = unsafe_convert(Ptr{T}, cconvert(Ptr{T}, x))This new function is invalid and can easily cause UB, even when both implementer and caller individually do nothing unsafe. To see why, consider this sentence from the docstring of cconvert
The result of this function should be kept valid (for the GC) until the result of unsafe_convert is not needed anymore.
Similarly, the documentation of unsafe_convert states:
Be careful to ensure that a Julia reference to x [i.e. the input argument] exists as long as the result of this function will be used.
The unsafe prefix on this function indicates that using the result of this function after the x argument to this function is no longer accessible to the program may cause undefined behavior, including program corruption or segfaults, at any later time.
These safety requirements are broken for the new pointer method.
Typically, cconvert will create some kind of heap-allocated, GC-tracked object, which can then be GC.@protected, then converted to a Ptr using unsafe_convert. For example, we have cconvert(::Type{Ptr{UInt8}}, s::AbstractString) = String(s).
You can use it like this:
function foo(s::AbstractString)
cval = Base.cconvert(Ptr{UInt8}, s)
GC.@protect cval begin
ptr = Base.unsafe_convert(Ptr{UInt8}, cval)
@ccall some_func(ptr ...The problem with the new, generic method is that the result of cconvert is not, and cannot be, GC preserved before it's converted to a pointer. So, the returned pointer is immediately invalid, and using it is UB according to #54099.
This is not purely hypothetical - this results in miscompilations today - the example below is a result of a miscompilation due to this invalid pointer method:
julia> using InlineStrings
julia> s = InlineString("abcde");
julia> findfirst(==(UInt8('c')), codeunits(s)) === nothing
trueThis bug is the same as this reduced example:
julia> struct MyVec <: AbstractVector{UInt8} end
struct MyWrapper x::UInt64 end
Base.cconvert(::Type{Ptr{UInt8}}, ::MyVec) = Ref(MyWrapper(0x0102030405060708))
Base.unsafe_convert(::Type{Ptr{UInt8}}, x::Ref{MyWrapper}) = Ptr{UInt8}(pointer_from_objref(x))
function foo(v::MyVec)
GC.@preserve v begin
ptr = pointer(v)
r = @ccall memchr(ptr::Ptr{Cchar}, 0x04::Cchar, UInt(8)::Csize_t)::Ptr{Nothing}
r === C_NULL ? nothing : (r - ptr + 1) % Int
end
end;
julia> foo(MyVec()) === nothing
trueNote that both the definition of cconvert and unsafe_convert is valid and correct, but these two definitions cause pointer to be UB. The generic definition of pointer(::CodeUnits{T, <:AbstractString}) is similarly UB by default, but happens to produce working code anyway on my computer.
This issue has been split off from a discussion in #54002 , because it is a separate issue.