Implement write barrier fastpath for sticky immix #8

Merged: 7 commits, May 16, 2023

@qinsoon qinsoon commented Apr 16, 2023

This PR implements the write barrier fastpath for sticky immix in both the runtime write barrier and the codegen write barrier. It also makes two other changes: (1) it passes the collection type to MMTk's `handle_user_collection_request`, and (2) it calls MMTk in `jl_gc_notify_image_alloc`.
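
For readers unfamiliar with the term, the following toy model illustrates what a write barrier fastpath generally looks like. It is only a sketch under assumptions: it models an object-remembering barrier keyed on one "unlogged" bit per 8-byte-aligned address, and the names `ToyHeap`, `is_unlogged`, and `write_barrier!` are hypothetical. The PR description does not say how the actual barrier or its metadata is laid out.

```julia
# Toy model only: the bit layout and all names are assumptions,
# not the MMTk or Julia runtime implementation.
struct ToyHeap
    unlog_bits::Vector{UInt8}   # one bit per 8-byte-aligned address
    remembered::Vector{UInt}    # objects recorded by the slowpath
end

# Fastpath check: one load, one shift, one mask.
@inline function is_unlogged(h::ToyHeap, obj::UInt)
    bit_index = obj >> 3                       # 8-byte granularity
    byte = h.unlog_bits[(bit_index >> 3) + 1]  # 8 bits per metadata byte
    return (byte >> (bit_index & 7)) & 0x01 == 0x01
end

# Returns true if the slowpath ran for this write.
function write_barrier!(h::ToyHeap, obj::UInt)
    is_unlogged(h, obj) || return false        # fastpath: nothing to do
    # Slowpath: remember the object, then clear its bit so that
    # later writes to the same object take the fastpath exit.
    push!(h.remembered, obj)
    bit_index = obj >> 3
    h.unlog_bits[(bit_index >> 3) + 1] &= ~(0x01 << (bit_index & 7))
    return true
end

heap = ToyHeap(fill(0xff, 16), UInt[])
first_write  = write_barrier!(heap, UInt(0x40))
second_write = write_barrier!(heap, UInt(0x40))
```

In this model only the first write to an object pays for the slowpath. Keeping the common case cheap is the reason to emit the check inline in generated code as well as in the runtime.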

@qinsoon qinsoon force-pushed the mmtk-write-barrier branch from b0e7fa8 to 91dc055 Compare May 9, 2023 05:20
@qinsoon qinsoon changed the title from "Allow MMTk implement write barrier and immortal/vm space." to "Implement write barrier fastpath for sticky immix" May 16, 2023
@qinsoon qinsoon marked this pull request as ready for review May 16, 2023 04:06
@qinsoon qinsoon merged commit e7e43f1 into mmtk:master May 16, 2023
qinsoon pushed a commit to qinsoon/julia that referenced this pull request May 2, 2024
This is part of the work to address JuliaLang#51352 by attempting to allow the
compiler to perform SROA on persistent data structures like
`PersistentDict` as if they were regular immutable data structures.
These sorts of data structures have very complicated internals (with
lots of mutation, memory sharing, etc.), but a relatively simple
interface. As such, it is unlikely that our compiler will have
sufficient power to optimize this interface by analyzing the
implementation.

We thus need to come up with some other mechanism that gives the
compiler license to perform the requisite optimization. One way would be
to just hardcode `PersistentDict` into the compiler, optimizing it like
any of the other builtin datatypes. However, this is of course very
unsatisfying. At the other end of the spectrum would be something like a
generic rewrite rule system (e-graphs anyone?) that would let the
PersistentDict implementation declare its interface to the compiler and
the compiler would use this for optimization (in a perfect world, the
actual rewrite would then be checked using some sort of formal methods).
I think that would be interesting, but we're very far from even being
able to design something like that (at least in Base - experiments with
external AbstractInterpreters in this direction are encouraged).

This PR tries to come up with a reasonable middle ground, where the
compiler gets some knowledge of the protocol hardcoded without having to
know about the implementation details of the data structure.

The basic idea is that `Core` provides some magic generic functions
that implementations can extend. Semantically, they are not special.
They dispatch as usual, and implementations are expected to work
properly even in the absence of any compiler optimizations.

However, the compiler is semantically permitted to perform structural
optimization using these magic generic functions. In the concrete case,
this PR introduces the `KeyValue` interface which consists of two
generic functions, `get` and `set`. The core optimization is that the
compiler is allowed to rewrite any occurrence of `get(set(x, k, v), k)`
into `v` without additional legality checks. In particular, the compiler
performs no type checks, conversions, etc. The higher level
implementation code is expected to do all that.
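
As an illustration of the contract, here is a minimal sketch of a data structure that satisfies the `get(set(x, k, v), k) == v` law. The names `kv_get`, `kv_set`, `assoc`, and `ToyMap` are hypothetical stand-ins; they are not the actual `Core` functions, whose exact signatures are not given here. The low-level methods perform no conversions, and a higher-level wrapper does the type work, as the text above requires.

```julia
# Hypothetical stand-ins for the interface's two generic functions.
function kv_get end
function kv_set end

# Toy persistent map: `kv_set` never mutates, it returns a new value.
struct ToyMap{K,V}
    entries::Vector{Pair{K,V}}
end
ToyMap{K,V}() where {K,V} = ToyMap{K,V}(Pair{K,V}[])

# Low-level protocol methods: no conversions, so replacing
# `kv_get(kv_set(m, k, v), k)` with `v` does not change the result.
function kv_set(m::ToyMap{K,V}, k::K, v::V) where {K,V}
    kept = filter(p -> !isequal(p.first, k), m.entries)  # fresh copy
    return ToyMap{K,V}(push!(kept, k => v))
end

function kv_get(m::ToyMap{K}, k::K) where {K}
    for p in m.entries
        isequal(p.first, k) && return p.second
    end
    throw(KeyError(k))
end

# Higher-level wrapper: type checks and conversions happen here,
# before the protocol functions are called.
assoc(m::ToyMap{K,V}, k, v) where {K,V} =
    kv_set(m, convert(K, k), convert(V, v))

m  = assoc(ToyMap{Symbol,Int}(), :a, 1)
m2 = assoc(m, :b, 2.0)   # 2.0 is converted to 2 before `kv_set` sees it
law_holds = kv_get(kv_set(m, :c, 3), :c) === 3
```

If `kv_set` converted its value argument itself, the rewrite to `v` could produce a different result than the unoptimized code, which is why the conversion is placed in the wrapper in this sketch.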

This approach closely matches the general direction we've been taking in
external AbstractInterpreters for embedding additional semantics and
optimization opportunities into Julia code (although we generally use
methods there, rather than full generic functions), so I think we have
some evidence that this sort of approach works reasonably well.

Nevertheless, this is certainly an experiment and the interface is
explicitly declared unstable.

## Current Status

This is fully working and implemented, but the optimization currently
bails on anything but the simplest cases. Filling all those cases in is
not particularly hard, but should be done along with a more invasive
refactoring of SROA, so we should figure out the general direction here
first and then we can finish all that up in a follow-up cleanup.

## Obligatory benchmark
Before:
```
julia> using BenchmarkTools

julia> function foo()
           a = Base.PersistentDict(:a => 1)
           return a[:a]
       end
foo (generic function with 1 method)

julia> @benchmark foo()
BenchmarkTools.Trial: 10000 samples with 993 evaluations.
 Range (min … max):  32.940 ns …  28.754 μs  ┊ GC (min … max):  0.00% … 99.76%
 Time  (median):     49.647 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   57.519 ns ± 333.275 ns  ┊ GC (mean ± σ):  10.81% ±  2.22%

        ▃█▅               ▁▃▅▅▃▁                ▁▃▂   ▂
  ▁▂▄▃▅▇███▇▃▁▂▁▁▁▁▁▁▁▁▂▂▅██████▅▂▁▁▁▁▁▁▁▁▁▁▂▃▃▇███▇▆███▆▄▃▃▂▂ ▃
  32.9 ns         Histogram: frequency by time         68.6 ns <

 Memory estimate: 128 bytes, allocs estimate: 4.

julia> @code_typed foo()
CodeInfo(
1 ─ %1  = invoke Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}(Base.HashArrayMappedTries.undef::UndefInitializer, 1::Int64)::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│   %2  = %new(Base.HashArrayMappedTries.HAMT{Symbol, Int64}, %1, 0x00000000)::Base.HashArrayMappedTries.HAMT{Symbol, Int64}
│   %3  = %new(Base.HashArrayMappedTries.Leaf{Symbol, Int64}, :a, 1)::Base.HashArrayMappedTries.Leaf{Symbol, Int64}
│   %4  = Base.getfield(%2, :data)::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│   %5  = $(Expr(:boundscheck, true))::Bool
└──       goto #5 if not %5
2 ─ %7  = Base.sub_int(1, 1)::Int64
│   %8  = Base.bitcast(UInt64, %7)::UInt64
│   %9  = Base.getfield(%4, :size)::Tuple{Int64}
│   %10 = $(Expr(:boundscheck, true))::Bool
│   %11 = Base.getfield(%9, 1, %10)::Int64
│   %12 = Base.bitcast(UInt64, %11)::UInt64
│   %13 = Base.ult_int(%8, %12)::Bool
└──       goto #4 if not %13
3 ─       goto #5
4 ─ %16 = Core.tuple(1)::Tuple{Int64}
│         invoke Base.throw_boundserror(%4::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}, %16::Tuple{Int64})::Union{}
└──       unreachable
5 ┄ %19 = Base.getfield(%4, :ref)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│   %20 = Base.memoryref(%19, 1, false)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
│         Base.memoryrefset!(%20, %3, :not_atomic, false)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}
└──       goto #6
6 ─ %23 = Base.getfield(%2, :bitmap)::UInt32
│   %24 = Base.or_int(%23, 0x00010000)::UInt32
│         Base.setfield!(%2, :bitmap, %24)::UInt32
└──       goto #7
7 ─ %27 = %new(Base.PersistentDict{Symbol, Int64}, %2)::Base.PersistentDict{Symbol, Int64}
└──       goto #8
8 ─ %29 = invoke Base.getindex(%27::Base.PersistentDict{Symbol, Int64}, :a::Symbol)::Int64
└──       return %29
```

After:
```
julia> using BenchmarkTools

julia> function foo()
           a = Base.PersistentDict(:a => 1)
           return a[:a]
       end
foo (generic function with 1 method)

julia> @benchmark foo()
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.459 ns … 11.320 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.460 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.469 ns ±  0.183 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂    █                                              ▁    █ ▂
  █▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁█ █
  2.46 ns      Histogram: log(frequency) by time     2.47 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @code_typed foo()
CodeInfo(
1 ─     return 1
```
qinsoon pushed a commit to qinsoon/julia that referenced this pull request May 2, 2024
`@something` eagerly unwraps any `Some` given to it, while reusing the
same variable for all of its arguments. This can be an issue if a
previously unpacked value is used as input to `@something`, leading to a
type instability with more than two arguments (e.g. because of a fallback
to `Some(nothing)`). By using a different variable for each argument,
type inference has an easier time handling these cases, which are isolated
to single branches anyway.
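
To make the idea concrete, here is a hand-written sketch of what `@something a b Some(nothing)` computes when each argument is bound to its own local. The function name `first_something` and the locals `v1`, `v2`, `v3` are illustrative only; the real macro generates its variable names and nests the branches rather than returning early.

```julia
# Each argument gets its own variable, so inference sees a narrow type
# per branch instead of one variable whose type is the union of all
# arguments.
function first_something(a, b)
    v1 = a
    if !isnothing(v1)
        return something(v1)   # unwraps `Some`, passes other values through
    end
    v2 = b
    if !isnothing(v2)
        return something(v2)
    end
    v3 = Some(nothing)         # fallback: unwraps to `nothing`
    return something(v3)
end

from_sketch = first_something(nothing, Some(0x03))
from_macro  = @something nothing Some(0x03) Some(nothing)
```

Both forms evaluate their arguments left to right and stop at the first one that is not `nothing`.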

This also adds some comments to the macro, since it's non-obvious what
it does.

Benchmarking the specific case I encountered this in led to a ~2x
performance improvement on multiple machines.

1.10-beta3/master:

```
[sukera@tower 01]$ jl1100 -q --project=. -L 01.jl -e 'bench()'
v"1.10.0-beta3"

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  38.670 μs … 70.350 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     43.340 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   43.395 μs ±  1.518 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              ▆█▂ ▁▁                           
  ▂▂▂▂▂▂▂▂▂▁▂▂▂▃▃▃▂▂▃▃▃▂▂▂▂▂▄▇███▆██▄▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  38.7 μs         Histogram: frequency by time          48 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
```

This PR:

```
[sukera@tower 01]$ julia -q --project=. -L 01.jl -e 'bench()'
v"1.11.0-DEV.970"

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  22.820 μs …  44.980 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     24.300 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.370 μs ± 832.239 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                ▂▅▇██▇▆▅▁                                       
  ▂▂▂▂▂▂▂▂▃▃▄▅▇███████████▅▄▃▃▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂ ▃
  22.8 μs         Histogram: frequency by time         27.7 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
``` 


<details>
<summary>Benchmarking code (spoilers for Advent Of Code 2023 Day 01,
Part 01). Running this requires the input of that Advent Of Code
day.</summary>

```julia
using BenchmarkTools
using InteractiveUtils

isdigit(d::UInt8) = UInt8('0') <= d <= UInt8('9')
someDigit(c::UInt8) = isdigit(c) ? Some(c - UInt8('0')) : nothing

function part1(data)
    total = 0
    may_a = nothing
    may_b = nothing

    for c in data
        digitRes = someDigit(c)
        may_a = @something may_a digitRes Some(nothing)
        may_b = @something digitRes may_b Some(nothing)
        if c == UInt8('\n')
            digit_a = may_a::UInt8
            digit_b = may_b::UInt8
            total += digit_a*0xa + digit_b
            may_a = nothing
            may_b = nothing
        end
    end

    return total
end

function bench()
    data = read("input.txt")
    display(VERSION)
    println()
    display(@benchmark part1($data))
    nothing
end
```
</details>

<details>
<summary>`@code_warntype` before</summary>

```julia
julia> @code_warntype part1(data)
MethodInstance for part1(::Vector{UInt8})
  from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7
Arguments
  #self#::Core.Const(part1)
  data::Vector{UInt8}
Locals
  @_3::Union{Nothing, Tuple{UInt8, Int64}}
  may_b::Union{Nothing, UInt8}
  may_a::Union{Nothing, UInt8}
  total::Int64
  c::UInt8
  digit_b::UInt8
  digit_a::UInt8
  val@_10::Any
  val@_11::Any
  digitRes::Union{Nothing, Some{UInt8}}
  @_13::Union{Some{Nothing}, Some{UInt8}, UInt8}
  @_14::Union{Some{Nothing}, Some{UInt8}}
  @_15::Some{Nothing}
  @_16::Union{Some{Nothing}, Some{UInt8}, UInt8}
  @_17::Union{Some{Nothing}, UInt8}
  @_18::Some{Nothing}
Body::Int64
1 ──       (total = 0)
│          (may_a = Main.nothing)
│          (may_b = Main.nothing)
│    %4  = data::Vector{UInt8}
│          (@_3 = Base.iterate(%4))
│    %6  = (@_3 === nothing)::Bool
│    %7  = Base.not_int(%6)::Bool
└───       goto #24 if not %7
2 ┄─       Core.NewvarNode(:(digit_b))
│          Core.NewvarNode(:(digit_a))
│          Core.NewvarNode(:(val@_10))
│    %12 = @_3::Tuple{UInt8, Int64}
│          (c = Core.getfield(%12, 1))
│    %14 = Core.getfield(%12, 2)::Int64
│          (digitRes = Main.someDigit(c))
│          (val@_11 = may_a)
│    %17 = (val@_11::Union{Nothing, UInt8} !== Base.nothing)::Bool
└───       goto #4 if not %17
3 ──       (@_13 = val@_11::UInt8)
└───       goto #11
4 ──       (val@_11 = digitRes)
│    %22 = (val@_11::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool
└───       goto #6 if not %22
5 ──       (@_14 = val@_11::Some{UInt8})
└───       goto #10
6 ──       (val@_11 = Main.Some(Main.nothing))
│    %27 = (val@_11::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true)
└───       goto #8 if not %27
7 ──       (@_15 = val@_11::Core.Const(Some(nothing)))
└───       goto #9
8 ──       Core.Const(:(@_15 = Base.nothing))
9 ┄─       (@_14 = @_15)
10 ┄       (@_13 = @_14)
11 ┄ %34 = @_13::Union{Some{Nothing}, Some{UInt8}, UInt8}
│          (may_a = Base.something(%34))
│          (val@_10 = digitRes)
│    %37 = (val@_10::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool
└───       goto #13 if not %37
12 ─       (@_16 = val@_10::Some{UInt8})
└───       goto #20
13 ─       (val@_10 = may_b)
│    %42 = (val@_10::Union{Nothing, UInt8} !== Base.nothing)::Bool
└───       goto #15 if not %42
14 ─       (@_17 = val@_10::UInt8)
└───       goto #19
15 ─       (val@_10 = Main.Some(Main.nothing))
│    %47 = (val@_10::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true)
└───       goto #17 if not %47
16 ─       (@_18 = val@_10::Core.Const(Some(nothing)))
└───       goto #18
17 ─       Core.Const(:(@_18 = Base.nothing))
18 ┄       (@_17 = @_18)
19 ┄       (@_16 = @_17)
20 ┄ %54 = @_16::Union{Some{Nothing}, Some{UInt8}, UInt8}
│          (may_b = Base.something(%54))
│    %56 = c::UInt8
│    %57 = Main.UInt8('\n')::Core.Const(0x0a)
│    %58 = (%56 == %57)::Bool
└───       goto #22 if not %58
21 ─       (digit_a = Core.typeassert(may_a, Main.UInt8))
│          (digit_b = Core.typeassert(may_b, Main.UInt8))
│    %62 = total::Int64
│    %63 = (digit_a * 0x0a)::UInt8
│    %64 = (%63 + digit_b)::UInt8
│          (total = %62 + %64)
│          (may_a = Main.nothing)
└───       (may_b = Main.nothing)
22 ┄       (@_3 = Base.iterate(%4, %14))
│    %69 = (@_3 === nothing)::Bool
│    %70 = Base.not_int(%69)::Bool
└───       goto #24 if not %70
23 ─       goto #2
24 ┄       return total
```
</details>

<details>
<summary>`@code_native debuginfo=:none` Before </summary>

```julia
julia> @code_native debuginfo=:none part1(data)
	.text
	.file	"part1"
	.globl	julia_part1_418                 # -- Begin function julia_part1_418
	.p2align	4, 0x90
	.type	julia_part1_418,@function
julia_part1_418:                        # @julia_part1_418
# %bb.0:                                # %top
	push	rbp
	mov	rbp, rsp
	push	r15
	push	r14
	push	r13
	push	r12
	push	rbx
	sub	rsp, 40
	mov	rax, qword ptr [rdi + 8]
	test	rax, rax
	je	.LBB0_1
# %bb.2:                                # %L17
	mov	rcx, qword ptr [rdi]
	dec	rax
	mov	r10b, 1
	xor	r14d, r14d
                                        # implicit-def: $r12b
                                        # implicit-def: $r13b
                                        # implicit-def: $r9b
                                        # implicit-def: $sil
	mov	qword ptr [rbp - 64], rax       # 8-byte Spill
	mov	al, 1
	mov	dword ptr [rbp - 48], eax       # 4-byte Spill
                                        # implicit-def: $al
                                        # kill: killed $al
	xor	eax, eax
	mov	qword ptr [rbp - 56], rax       # 8-byte Spill
	mov	qword ptr [rbp - 72], rcx       # 8-byte Spill
                                        # implicit-def: $cl
	jmp	.LBB0_3
	.p2align	4, 0x90
.LBB0_8:                                #   in Loop: Header=BB0_3 Depth=1
	mov	dword ptr [rbp - 48], 0         # 4-byte Folded Spill
.LBB0_24:                               # %post_union_move
                                        #   in Loop: Header=BB0_3 Depth=1
	movzx	r13d, byte ptr [rbp - 41]       # 1-byte Folded Reload
	mov	r12d, r8d
	cmp	qword ptr [rbp - 64], r14       # 8-byte Folded Reload
	je	.LBB0_13
.LBB0_25:                               # %guard_exit113
                                        #   in Loop: Header=BB0_3 Depth=1
	inc	r14
	mov	r10d, ebx
.LBB0_3:                                # %L19
                                        # =>This Inner Loop Header: Depth=1
	mov	rax, qword ptr [rbp - 72]       # 8-byte Reload
	xor	ebx, ebx
	xor	edi, edi
	movzx	r15d, r9b
	movzx	ecx, cl
	movzx	esi, sil
	mov	r11b, 1
                                        # implicit-def: $r9b
	movzx	edx, byte ptr [rax + r14]
	lea	eax, [rdx - 58]
	lea	r8d, [rdx - 48]
	cmp	al, -10
	setae	bl
	setb	dil
	test	r10b, 1
	cmovne	r15d, edi
	mov	edi, 0
	cmovne	ecx, ebx
	mov	bl, 1
	cmovne	esi, edi
	test	r15b, 1
	jne	.LBB0_7
# %bb.4:                                # %L76
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	r11b, 2
	test	cl, 1
	jne	.LBB0_5
# %bb.6:                                # %L78
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	ebx, r10d
	mov	r9d, r15d
	mov	byte ptr [rbp - 41], r13b       # 1-byte Spill
	test	sil, 1
	je	.LBB0_26
.LBB0_7:                                # %L82
                                        #   in Loop: Header=BB0_3 Depth=1
	cmp	al, -11
	jbe	.LBB0_9
	jmp	.LBB0_8
	.p2align	4, 0x90
.LBB0_5:                                #   in Loop: Header=BB0_3 Depth=1
	mov	ecx, r8d
	mov	sil, 1
	xor	ebx, ebx
	mov	byte ptr [rbp - 41], r8b        # 1-byte Spill
	xor	r9d, r9d
	xor	ecx, ecx
	cmp	al, -11
	ja	.LBB0_8
.LBB0_9:                                # %L90
                                        #   in Loop: Header=BB0_3 Depth=1
	test	byte ptr [rbp - 48], 1          # 1-byte Folded Reload
	jne	.LBB0_23
# %bb.10:                               # %L115
                                        #   in Loop: Header=BB0_3 Depth=1
	cmp	dl, 10
	jne	.LBB0_11
# %bb.14:                               # %L122
                                        #   in Loop: Header=BB0_3 Depth=1
	test	r15b, 1
	jne	.LBB0_15
# %bb.12:                               # %L130.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	movzx	eax, byte ptr [rbp - 41]        # 1-byte Folded Reload
	mov	bl, 1
	add	eax, eax
	lea	eax, [rax + 4*rax]
	add	al, r12b
	movzx	eax, al
	add	qword ptr [rbp - 56], rax       # 8-byte Folded Spill
	mov	al, 1
	mov	dword ptr [rbp - 48], eax       # 4-byte Spill
	cmp	qword ptr [rbp - 64], r14       # 8-byte Folded Reload
	jne	.LBB0_25
	jmp	.LBB0_13
	.p2align	4, 0x90
.LBB0_23:                               # %L115.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	al, 1
                                        # implicit-def: $r8b
	mov	dword ptr [rbp - 48], eax       # 4-byte Spill
	cmp	dl, 10
	jne	.LBB0_24
	jmp	.LBB0_21
.LBB0_11:                               #   in Loop: Header=BB0_3 Depth=1
	mov	r8d, r12d
	jmp	.LBB0_24
.LBB0_1:
	xor	eax, eax
	mov	qword ptr [rbp - 56], rax       # 8-byte Spill
.LBB0_13:                               # %L159
	mov	rax, qword ptr [rbp - 56]       # 8-byte Reload
	add	rsp, 40
	pop	rbx
	pop	r12
	pop	r13
	pop	r14
	pop	r15
	pop	rbp
	ret
.LBB0_21:                               # %L122.thread
	test	r15b, 1
	jne	.LBB0_15
# %bb.22:                               # %post_box_union58
	movabs	rdi, offset .L_j_str1
	movabs	rax, offset ijl_type_error
	movabs	rsi, 140008511215408
	movabs	rdx, 140008667209736
	call	rax
.LBB0_15:                               # %fail
	cmp	r11b, 1
	je	.LBB0_19
# %bb.16:                               # %fail
	movzx	eax, r11b
	cmp	eax, 2
	jne	.LBB0_17
# %bb.20:                               # %box_union54
	movzx	eax, byte ptr [rbp - 41]        # 1-byte Folded Reload
	movabs	rcx, offset jl_boxed_uint8_cache
	mov	rdx, qword ptr [rcx + 8*rax]
	jmp	.LBB0_18
.LBB0_26:                               # %L80
	movabs	rax, offset ijl_throw
	movabs	rdi, 140008495049392
	call	rax
.LBB0_19:                               # %box_union
	movabs	rdx, 140008667209736
	jmp	.LBB0_18
.LBB0_17:
	xor	edx, edx
.LBB0_18:                               # %post_box_union
	movabs	rdi, offset .L_j_str1
	movabs	rax, offset ijl_type_error
	movabs	rsi, 140008511215408
	call	rax
.Lfunc_end0:
	.size	julia_part1_418, .Lfunc_end0-julia_part1_418
                                        # -- End function
	.type	.L_j_str1,@object               # @_j_str1
	.section	.rodata.str1.1,"aMS",@progbits,1
.L_j_str1:
	.asciz	"typeassert"
	.size	.L_j_str1, 11

	.section	".note.GNU-stack","",@progbits
```
</details>

<details>
<summary>`@code_warntype` After</summary>

```julia

[sukera@tower 01]$ julia -q --project=. -L 01.jl
julia> data = read("input.txt");

julia> @code_warntype part1(data)
MethodInstance for part1(::Vector{UInt8})
  from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7
Arguments
  #self#::Core.Const(part1)
  data::Vector{UInt8}
Locals
  @_3::Union{Nothing, Tuple{UInt8, Int64}}
  may_b::Union{Nothing, UInt8}
  may_a::Union{Nothing, UInt8}
  total::Int64
  val@_7::Union{}
  val@_8::Union{}
  c::UInt8
  digit_b::UInt8
  digit_a::UInt8
  ##215::Some{Nothing}
  ##216::Union{Nothing, UInt8}
  ##217::Union{Nothing, Some{UInt8}}
  ##212::Some{Nothing}
  ##213::Union{Nothing, Some{UInt8}}
  ##214::Union{Nothing, UInt8}
  digitRes::Union{Nothing, Some{UInt8}}
  @_19::Union{Nothing, UInt8}
  @_20::Union{Nothing, UInt8}
  @_21::Nothing
  @_22::Union{Nothing, UInt8}
  @_23::Union{Nothing, UInt8}
  @_24::Nothing
Body::Int64
1 ──        (total = 0)
│           (may_a = Main.nothing)
│           (may_b = Main.nothing)
│    %4   = data::Vector{UInt8}
│           (@_3 = Base.iterate(%4))
│    %6   = @_3::Union{Nothing, Tuple{UInt8, Int64}}
│    %7   = (%6 === nothing)::Bool
│    %8   = Base.not_int(%7)::Bool
└───        goto #24 if not %8
2 ┄─        Core.NewvarNode(:(val@_7))
│           Core.NewvarNode(:(val@_8))
│           Core.NewvarNode(:(digit_b))
│           Core.NewvarNode(:(digit_a))
│           Core.NewvarNode(:(##215))
│           Core.NewvarNode(:(##216))
│           Core.NewvarNode(:(##217))
│           Core.NewvarNode(:(##212))
│           Core.NewvarNode(:(##213))
│    %19  = @_3::Tuple{UInt8, Int64}
│           (c = Core.getfield(%19, 1))
│    %21  = Core.getfield(%19, 2)::Int64
│    %22  = c::UInt8
│           (digitRes = Main.someDigit(%22))
│    %24  = may_a::Union{Nothing, UInt8}
│           (##214 = %24)
│    %26  = Base.:!::Core.Const(!)
│    %27  = ##214::Union{Nothing, UInt8}
│    %28  = Base.isnothing(%27)::Bool
│    %29  = (%26)(%28)::Bool
└───        goto #4 if not %29
3 ── %31  = ##214::UInt8
│           (@_19 = Base.something(%31))
└───        goto #11
4 ── %34  = digitRes::Union{Nothing, Some{UInt8}}
│           (##213 = %34)
│    %36  = Base.:!::Core.Const(!)
│    %37  = ##213::Union{Nothing, Some{UInt8}}
│    %38  = Base.isnothing(%37)::Bool
│    %39  = (%36)(%38)::Bool
└───        goto #6 if not %39
5 ── %41  = ##213::Some{UInt8}
│           (@_20 = Base.something(%41))
└───        goto #10
6 ── %44  = Main.Some::Core.Const(Some)
│    %45  = Main.nothing::Core.Const(nothing)
│           (##212 = (%44)(%45))
│    %47  = Base.:!::Core.Const(!)
│    %48  = ##212::Core.Const(Some(nothing))
│    %49  = Base.isnothing(%48)::Core.Const(false)
│    %50  = (%47)(%49)::Core.Const(true)
└───        goto #8 if not %50
7 ── %52  = ##212::Core.Const(Some(nothing))
│           (@_21 = Base.something(%52))
└───        goto #9
8 ──        Core.Const(nothing)
│           Core.Const(:(val@_8 = Base.something(Base.nothing)))
│           Core.Const(nothing)
│           Core.Const(:(val@_8))
└───        Core.Const(:(@_21 = %58))
9 ┄─ %60  = @_21::Core.Const(nothing)
└───        (@_20 = %60)
10 ┄ %62  = @_20::Union{Nothing, UInt8}
└───        (@_19 = %62)
11 ┄ %64  = @_19::Union{Nothing, UInt8}
│           (may_a = %64)
│    %66  = digitRes::Union{Nothing, Some{UInt8}}
│           (##217 = %66)
│    %68  = Base.:!::Core.Const(!)
│    %69  = ##217::Union{Nothing, Some{UInt8}}
│    %70  = Base.isnothing(%69)::Bool
│    %71  = (%68)(%70)::Bool
└───        goto #13 if not %71
12 ─ %73  = ##217::Some{UInt8}
│           (@_22 = Base.something(%73))
└───        goto #20
13 ─ %76  = may_b::Union{Nothing, UInt8}
│           (##216 = %76)
│    %78  = Base.:!::Core.Const(!)
│    %79  = ##216::Union{Nothing, UInt8}
│    %80  = Base.isnothing(%79)::Bool
│    %81  = (%78)(%80)::Bool
└───        goto #15 if not %81
14 ─ %83  = ##216::UInt8
│           (@_23 = Base.something(%83))
└───        goto #19
15 ─ %86  = Main.Some::Core.Const(Some)
│    %87  = Main.nothing::Core.Const(nothing)
│           (##215 = (%86)(%87))
│    %89  = Base.:!::Core.Const(!)
│    %90  = ##215::Core.Const(Some(nothing))
│    %91  = Base.isnothing(%90)::Core.Const(false)
│    %92  = (%89)(%91)::Core.Const(true)
└───        goto #17 if not %92
16 ─ %94  = ##215::Core.Const(Some(nothing))
│           (@_24 = Base.something(%94))
└───        goto #18
17 ─        Core.Const(nothing)
│           Core.Const(:(val@_7 = Base.something(Base.nothing)))
│           Core.Const(nothing)
│           Core.Const(:(val@_7))
└───        Core.Const(:(@_24 = %100))
18 ┄ %102 = @_24::Core.Const(nothing)
└───        (@_23 = %102)
19 ┄ %104 = @_23::Union{Nothing, UInt8}
└───        (@_22 = %104)
20 ┄ %106 = @_22::Union{Nothing, UInt8}
│           (may_b = %106)
│    %108 = Main.:(==)::Core.Const(==)
│    %109 = c::UInt8
│    %110 = Main.UInt8('\n')::Core.Const(0x0a)
│    %111 = (%108)(%109, %110)::Bool
└───        goto #22 if not %111
21 ─ %113 = may_a::Union{Nothing, UInt8}
│           (digit_a = Core.typeassert(%113, Main.UInt8))
│    %115 = may_b::Union{Nothing, UInt8}
│           (digit_b = Core.typeassert(%115, Main.UInt8))
│    %117 = Main.:+::Core.Const(+)
│    %118 = total::Int64
│    %119 = Main.:+::Core.Const(+)
│    %120 = Main.:*::Core.Const(*)
│    %121 = digit_a::UInt8
│    %122 = (%120)(%121, 0x0a)::UInt8
│    %123 = digit_b::UInt8
│    %124 = (%119)(%122, %123)::UInt8
│           (total = (%117)(%118, %124))
│           (may_a = Main.nothing)
└───        (may_b = Main.nothing)
22 ┄        (@_3 = Base.iterate(%4, %21))
│    %129 = @_3::Union{Nothing, Tuple{UInt8, Int64}}
│    %130 = (%129 === nothing)::Bool
│    %131 = Base.not_int(%130)::Bool
└───        goto #24 if not %131
23 ─        goto #2
24 ┄ %134 = total::Int64
└───        return %134
```
</details>


<details>
<summary>`@code_native debuginfo=:none` After </summary>

```julia

julia> @code_native debuginfo=:none part1(data)
	.text
	.file	"part1"
	.globl	julia_part1_1203                # -- Begin function julia_part1_1203
	.p2align	4, 0x90
	.type	julia_part1_1203,@function
julia_part1_1203:                       # @julia_part1_1203
; Function Signature: part1(Array{UInt8, 1})
# %bb.0:                                # %top
	#DEBUG_VALUE: part1:data <- [DW_OP_deref] $rdi
	push	rbp
	mov	rbp, rsp
	push	r15
	push	r14
	push	r13
	push	r12
	push	rbx
	sub	rsp, 40
	vxorps	xmm0, xmm0, xmm0
	#APP
	mov	rax, qword ptr fs:[0]
	#NO_APP
	lea	rdx, [rbp - 64]
	vmovaps	xmmword ptr [rbp - 64], xmm0
	mov	qword ptr [rbp - 48], 0
	mov	rcx, qword ptr [rax - 8]
	mov	qword ptr [rbp - 64], 4
	mov	rax, qword ptr [rcx]
	mov	qword ptr [rbp - 72], rcx       # 8-byte Spill
	mov	qword ptr [rbp - 56], rax
	mov	qword ptr [rcx], rdx
	#DEBUG_VALUE: part1:data <- [DW_OP_deref] 0
	mov	r15, qword ptr [rdi + 16]
	test	r15, r15
	je	.LBB0_1
# %bb.2:                                # %L34
	mov	r14, qword ptr [rdi]
	dec	r15
	mov	r11b, 1
	mov	r13b, 1
                                        # implicit-def: $r12b
                                        # implicit-def: $r10b
	xor	eax, eax
	jmp	.LBB0_3
	.p2align	4, 0x90
.LBB0_4:                                #   in Loop: Header=BB0_3 Depth=1
	xor	r11d, r11d
	mov	ebx, edi
	mov	r10d, r8d
.LBB0_9:                                # %L114
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	r12d, esi
	test	r15, r15
	je	.LBB0_12
.LBB0_10:                               # %guard_exit126
                                        #   in Loop: Header=BB0_3 Depth=1
	inc	r14
	dec	r15
	mov	r13d, ebx
.LBB0_3:                                # %L36
                                        # =>This Inner Loop Header: Depth=1
	movzx	edx, byte ptr [r14]
	test	r13b, 1
	movzx	edi, r13b
	mov	ebx, 1
	mov	ecx, 0
	cmove	ebx, edi
	cmovne	edi, ecx
	movzx	ecx, r10b
	lea	esi, [rdx - 48]
	lea	r9d, [rdx - 58]
	movzx	r8d, sil
	cmove	r8d, ecx
	cmp	r9b, -11
	ja	.LBB0_4
# %bb.5:                                # %L89
                                        #   in Loop: Header=BB0_3 Depth=1
	test	r11b, 1
	jne	.LBB0_8
# %bb.6:                                # %L102
                                        #   in Loop: Header=BB0_3 Depth=1
	cmp	dl, 10
	jne	.LBB0_7
# %bb.13:                               # %L106
                                        #   in Loop: Header=BB0_3 Depth=1
	test	r13b, 1
	jne	.LBB0_14
# %bb.11:                               # %L114.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	add	ecx, ecx
	mov	bl, 1
	mov	r11b, 1
	lea	ecx, [rcx + 4*rcx]
	add	cl, r12b
	movzx	ecx, cl
	add	rax, rcx
	test	r15, r15
	jne	.LBB0_10
	jmp	.LBB0_12
	.p2align	4, 0x90
.LBB0_8:                                # %L102.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	r11b, 1
                                        # implicit-def: $sil
	cmp	dl, 10
	jne	.LBB0_9
	jmp	.LBB0_15
.LBB0_7:                                #   in Loop: Header=BB0_3 Depth=1
	mov	esi, r12d
	jmp	.LBB0_9
.LBB0_1:
	xor	eax, eax
.LBB0_12:                               # %L154
	mov	rcx, qword ptr [rbp - 56]
	mov	rdx, qword ptr [rbp - 72]       # 8-byte Reload
	mov	qword ptr [rdx], rcx
	add	rsp, 40
	pop	rbx
	pop	r12
	pop	r13
	pop	r14
	pop	r15
	pop	rbp
	ret
.LBB0_15:                               # %L106.thread
	test	r13b, 1
	jne	.LBB0_14
# %bb.16:                               # %post_box_union47
	movabs	rax, offset jl_nothing
	movabs	rcx, offset jl_small_typeof
	movabs	rdi, offset ".L_j_str_typeassert#1"
	mov	rdx, qword ptr [rax]
	mov	rsi, qword ptr [rcx + 336]
	movabs	rax, offset ijl_type_error
	mov	qword ptr [rbp - 48], rsi
	call	rax
.LBB0_14:                               # %post_box_union
	movabs	rax, offset jl_nothing
	movabs	rcx, offset jl_small_typeof
	movabs	rdi, offset ".L_j_str_typeassert#1"
	mov	rdx, qword ptr [rax]
	mov	rsi, qword ptr [rcx + 336]
	movabs	rax, offset ijl_type_error
	mov	qword ptr [rbp - 48], rsi
	call	rax
.Lfunc_end0:
	.size	julia_part1_1203, .Lfunc_end0-julia_part1_1203
                                        # -- End function
	.type	".L_j_str_typeassert#1",@object # @"_j_str_typeassert#1"
	.section	.rodata.str1.1,"aMS",@progbits,1
".L_j_str_typeassert#1":
	.asciz	"typeassert"
	.size	".L_j_str_typeassert#1", 11

	.section	".note.GNU-stack","",@progbits
```
</details>

Co-authored-by: Sukera <Seelengrab@users.noreply.github.com>
udesou pushed a commit to udesou/julia that referenced this pull request Oct 16, 2024
E.g. this allows `finalizer` inlining in the following case:
```julia
mutable struct ForeignBuffer{T}
    const ptr::Ptr{T}
end
const foreign_buffer_finalized = Ref(false)
function foreign_alloc(::Type{T}, length) where T
    ptr = Libc.malloc(sizeof(T) * length)
    ptr = Base.unsafe_convert(Ptr{T}, ptr)
    obj = ForeignBuffer{T}(ptr)
    return finalizer(obj) do obj
        Base.@assume_effects :notaskstate :nothrow
        foreign_buffer_finalized[] = true
        Libc.free(obj.ptr)
    end
end
function f_EA_finalizer(N::Int)
    workspace = foreign_alloc(Float64, N)
    GC.@preserve workspace begin
        (;ptr) = workspace
        Base.@assume_effects :nothrow @noinline println(devnull, "ptr = ", ptr)
    end
end
```
```julia
julia> @code_typed f_EA_finalizer(42)
CodeInfo(
1 ── %1  = Base.mul_int(8, N)::Int64
│    %2  = Core.lshr_int(%1, 63)::Int64
│    %3  = Core.trunc_int(Core.UInt8, %2)::UInt8
│    %4  = Core.eq_int(%3, 0x01)::Bool
└───       goto #3 if not %4
2 ──       invoke Core.throw_inexacterror(:convert::Symbol, UInt64::Type, %1::Int64)::Union{}
└───       unreachable
3 ──       goto #4
4 ── %9  = Core.bitcast(Core.UInt64, %1)::UInt64
└───       goto #5
5 ──       goto #6
6 ──       goto #7
7 ──       goto #8
8 ── %14 = $(Expr(:foreigncall, :(:malloc), Ptr{Nothing}, svec(UInt64), 0, :(:ccall), :(%9), :(%9)))::Ptr{Nothing}
└───       goto #9
9 ── %16 = Base.bitcast(Ptr{Float64}, %14)::Ptr{Float64}
│    %17 = %new(ForeignBuffer{Float64}, %16)::ForeignBuffer{Float64}
└───       goto #10
10 ─ %19 = $(Expr(:gc_preserve_begin, :(%17)))
│    %20 = Base.getfield(%17, :ptr)::Ptr{Float64}
│          invoke Main.println(Main.devnull::Base.DevNull, "ptr = "::String, %20::Ptr{Float64})::Nothing
│          $(Expr(:gc_preserve_end, :(%19)))
│    %23 = Main.foreign_buffer_finalized::Base.RefValue{Bool}
│          Base.setfield!(%23, :x, true)::Bool
│    %25 = Base.getfield(%17, :ptr)::Ptr{Float64}
│    %26 = Base.bitcast(Ptr{Nothing}, %25)::Ptr{Nothing}
│          $(Expr(:foreigncall, :(:free), Nothing, svec(Ptr{Nothing}), 0, :(:ccall), :(%26), :(%25)))::Nothing
└───       return nothing
) => Nothing
```

However, this is still a WIP. Before merging, I want to improve EA's
precision a bit and at least fix the test case that is currently marked
as `broken`. I also need to check its impact on compiler performance.

Additionally, I believe this feature is not yet practical. In
particular, there is still significant room for improvement in the
following areas:
- EA's interprocedural capabilities: currently EA is performed ad-hoc
for limited frames because of latency reasons, which significantly
reduces its precision in the presence of interprocedural calls.
- Relaxing the `:nothrow` check for finalizer inlining: the current
algorithm requires `:nothrow`-ness on all paths from the allocation of
the mutable struct to its last use, which is not practical for
real-world cases. Even when `:nothrow` cannot be guaranteed, auxiliary
optimizations such as inserting a `finalize` call after the last use
might still be possible (JuliaLang#55990).
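The last bullet mentions inserting a `finalize` call after the last use. As a rough illustration of what that looks like when written by hand (the compiler transformation itself is not shown here), the sketch below calls `finalize` explicitly. The names `Handle`, `finalized_count` and `with_handle` are hypothetical.

```julia
# Hand-written analogue of "run the finalizer right after the last use".
# `Handle`, `finalized_count` and `with_handle` are illustrative names only.
mutable struct Handle
    open::Bool
end

const finalized_count = Ref(0)

function with_handle()
    h = Handle(true)
    finalizer(h) do x
        x.open = false
        finalized_count[] += 1
    end
    result = h.open   # last real use of `h`
    finalize(h)       # run the registered finalizer now instead of at GC time
    return result
end

result = with_handle()
```

Calling `finalize` eagerly releases the resource without waiting for a collection, which is the effect the proposed optimization aims to get automatically.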
udesou pushed a commit to udesou/julia that referenced this pull request Jul 29, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified
by tsan:

```
./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end'
==================
WARNING: ThreadSanitizer: data race (pid=5575)
  Write of size 4 at 0xffff9bf9bd28 by thread T9:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Previous write of size 4 at 0xffff9bf9bd28 by thread T10:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28)
```
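The actual fix lives in C (`src/module.c`), and that code is not reproduced here. As a Julia-level analogue of the same technique, the sketch below replaces a plain read-modify-write on a shared counter with an atomic fetch-and-add. The names `module_counter`, `next_id` and `make_ids` are hypothetical.

```julia
using Base.Threads: Atomic, atomic_add!, @threads

# Shared counter, analogous in spirit to the static `mcounter` in the report.
const module_counter = Atomic{UInt32}(0)

# atomic_add! returns the old value, so add one to get the freshly reserved id.
next_id() = atomic_add!(module_counter, UInt32(1)) + UInt32(1)

function make_ids(n)
    ids = Vector{UInt32}(undef, n)
    @threads for i in 1:n
        ids[i] = next_id()   # each iteration gets a distinct id, no lost updates
    end
    return ids
end

ids = make_ids(1000)
```

Because the increment and the read happen as one atomic operation, two threads can never observe the same counter value, which is the property the racy version lacked.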
udesou pushed a commit to udesou/julia that referenced this pull request Jul 29, 2025
Simplify `workqueue_for`. While not strictly necessary, the acquire load
in `getindex(once::OncePerThread{T,F}, tid::Integer)` makes
ThreadSanitizer happy. With the existing implementation, we get false
positives whenever a thread other than the one that originally allocated
the array reads it:

```
==================
WARNING: ThreadSanitizer: data race (pid=6819)
  Atomic read of size 8 at 0xffff86bec058 by main thread:
    #0 getproperty Base_compiler.jl:57 (sys.so+0x113b478)
    #1 julia_pushNOT._1925 task.jl:868 (sys.so+0x113b478)
    #2 julia_enq_work_1896 task.jl:969 (sys.so+0x5cd218)
    #3 schedule task.jl:983 (sys.so+0x892294)
    #4 macro expansion threadingconstructs.jl:522 (sys.so+0x892294)
    #5 julia_start_profile_listener_60681 Base.jl:355 (sys.so+0x892294)
    #6 julia___init___60641 Base.jl:392 (sys.so+0x1178dc)
    #7 jfptr___init___60642 <null> (sys.so+0x118134)
    #8 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
    #9 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
    #10 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0xbba74)
    #11 jl_module_run_initializer /home/user/c/julia/src/toplevel.c:68:13 (libjulia-internal.so.1.13+0xbba74)
    #12 _finish_jl_init_ /home/user/c/julia/src/init.c:632:13 (libjulia-internal.so.1.13+0x9c0fc)
    #13 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
    #14 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
    #15 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
    #16 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)

  Previous write of size 8 at 0xffff86bec058 by thread T2:
    #0 IntrusiveLinkedListSynchronized task.jl:863 (sys.so+0x78d220)
    #1 macro expansion task.jl:932 (sys.so+0x78d220)
    #2 macro expansion lock.jl:376 (sys.so+0x78d220)
    #3 julia_workqueue_for_1933 task.jl:924 (sys.so+0x78d220)
    #4 julia_wait_2048 task.jl:1204 (sys.so+0x6255ac)
    #5 julia_task_done_hook_49205 task.jl:839 (sys.so+0x128fdc0)
    #6 jfptr_task_done_hook_49206 <null> (sys.so+0x902218)
    #7 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
    #8 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
    #9 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9c79c)
    #10 jl_finish_task /home/user/c/julia/src/task.c:345:13 (libjulia-internal.so.1.13+0x9c79c)
    #11 jl_threadfun /home/user/c/julia/src/scheduler.c:122:5 (libjulia-internal.so.1.13+0xe7db8)

  Thread T2 (tid=6824, running) created by main thread at:
    #0 pthread_create <null> (julia+0x85f88)
    #1 uv_thread_create_ex /workspace/srcdir/libuv/src/unix/thread.c:172 (libjulia-internal.so.1.13+0x1a8d70)
    #2 _finish_jl_init_ /home/user/c/julia/src/init.c:618:5 (libjulia-internal.so.1.13+0x9c010)
    #3 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
    #4 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
    #5 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
    #6 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)

SUMMARY: ThreadSanitizer: data race Base_compiler.jl:57 in getproperty
==================
```
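For readers unfamiliar with the release/acquire pairing mentioned above, here is a small sketch of the general pattern using Julia's `@atomic` fields. It is not the `OncePerThread` implementation, and `LazySlot`, `publish!` and `read_queue` are hypothetical names.

```julia
# One thread publishes a freshly allocated array with a release store;
# another thread reads it with an acquire load.
mutable struct LazySlot
    @atomic queue::Union{Nothing,Vector{Int}}
end

const slot = LazySlot(nothing)

function publish!(s::LazySlot)
    q = [1, 2, 3]                 # initialize fully before publishing
    @atomic :release s.queue = q  # release store makes the contents visible
    return nothing
end

function read_queue(s::LazySlot)
    return @atomic :acquire s.queue  # acquire load pairs with the release store
end

wait(Threads.@spawn publish!(slot))
q = read_queue(slot)
```

The acquire load gives the reader a happens-before edge to the writer's initialization, which is the kind of ordering a race detector looks for.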
udesou added a commit that referenced this pull request Aug 7, 2025
* Increment state conditionally in `CartesianIndices` iteration (#58742)

Fixes https://github.com/JuliaLang/julia/issues/53430

```julia
julia> a = rand(100,100); b = similar(a); av = view(a, axes(a)...); bv = view(b, axes(b)...); bv2 = view(b, UnitRange.(axes(b))...);

julia> @btime copyto!($bv2, $av); # slow, indices are UnitRanges
  12.352 μs (0 allocations: 0 bytes) # master, v"1.13.0-DEV.745"
  1.662 μs (0 allocations: 0 bytes) # this PR
  
julia> @btime copyto!($bv, $av); # reference
  1.733 μs (0 allocations: 0 bytes)
```
The performances become comparable after this PR.

I've also renamed the second `I` to `Itail`, as the two variables
represent different quantities.

* 🤖 [master] Bump the Distributed stdlib from 51e5297 to 3679026 (#58748)

Stdlib: Distributed
URL: https://github.com/JuliaLang/Distributed.jl
Stdlib branch: master
Julia branch: master
Old commit: 51e5297
New commit: 3679026
Julia version: 1.13.0-DEV
Distributed version: 1.11.0(Does not match)
Bump invoked by: @DilumAluthge
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
https://github.com/JuliaLang/Distributed.jl/compare/51e52978481835413d15b589919aba80dd85f890...3679026d7b510befdedfa8c6497e3cb032f9cea1

```
$ git log --oneline 51e5297..3679026
3679026 Merge pull request #137 from JuliaLang/dpa/dont-use-link-local
875cd5a Rewrite the code to be a bit more explicit
2a6ee53 Non-link-local IP4 > non-link-local IP6 > link-local IP4 > link-local IP6
c0e9eb4 Factor functionality out into separate `choose_bind_addr()` function
86cbb8a Add explanation
0b7288c Worker: Bind to the first non-link-local IPv4 address
ff8689a Merge pull request #131 from JuliaLang/spawnat-docs
ba3c843 Document that `@spawnat :any` doesn't do load-balancing
```

Co-authored-by: DilumAluthge <5619885+DilumAluthge@users.noreply.github.com>

* devdocs: contributing: fix headings (#58749)

In particular, it seems like Documenter takes the level-one heading to
define the page title. So the page titles were missing in the TOC before
this change.

* Work around LLVM JITLink stack overflow issue. (#58579)

The JITLinker recurses for every symbol in the list, so limit the size of
the list.

This is kind of ugly. Also, 1000 might be too large, but we don't want to go
too small because that wastes memory, and 1000 was fine locally for the
things I tested.

Fixes https://github.com/JuliaLang/julia/issues/58229
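The batching idea can be sketched in Julia as follows. This only illustrates capping the size of each list handed to a recursive consumer; it is not the C++ change itself. `MAX_BATCH`, `link_in_batches` and the `link` callback are hypothetical.

```julia
# Cap how many symbols are passed to the linker at once.
const MAX_BATCH = 1000   # assumed limit, taken from the number quoted above

function link_in_batches(link, symbols; batch = MAX_BATCH)
    sizes = Int[]
    for chunk in Iterators.partition(symbols, batch)
        link(chunk)                  # each call sees at most `batch` symbols
        push!(sizes, length(chunk))
    end
    return sizes
end

# A no-op "linker" over 2500 stand-in symbols.
sizes = link_in_batches(_ -> nothing, 1:2500)
```

Smaller batches bound the recursion depth per call, at the cost of more calls and, per the note above, more memory overhead.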

* bump Compiler.jl version to 0.1.1 (#58744)

As the latest version of BaseCompiler.jl will be bumped to v0.1.1 after
JuliaRegistries/General#132990.

* REPL: fix typo and potential `UndefVarError` (#58761)

Detected by the new LS diagnostics:)

* fix fallback code path in `take!(::IOBuffer)` method (#58762)

JET told me that the `data` local variable in particular may be undefined
at this point.
After reviewing this code, I think this code path is actually unreachable,
since `bytesavailable(io::IOBuffer)` returns `0` when `io` has
been closed. So it's probably better to make that explicit.
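A quick way to check the claim about closed buffers, as a sketch using only public `IOBuffer` functions:

```julia
io = IOBuffer()
write(io, "hello")
seekstart(io)                    # rewind so the written bytes are readable
n_before = bytesavailable(io)    # bytes readable while the buffer is open

close(io)
n_after = bytesavailable(io)     # nothing is readable once it is closed
```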

* Fix multi-threading docs typo (#58770)

* help bounds checking to be eliminated for `getindex(::Memory, ::Int)` (#58754)

Second try for PR #58741.

This moves the `getindex(::Memory, ::Int)` bounds check to Julia, which
is how it's already done for `getindex(::Array, ::Int)`, so I guess it's
correct.

Also deduplicate the bounds checking code while at it.

* Define textwidth for overlong chars (#58602)

Previously, this would error. There is no guarantee of how terminals
render overlong encodings. Some terminals do not print them at all,
and some print "�". Here, we set a textwidth of 1, conservatively.

Refs #58593

* Add MethodError hints for functions in other modules (#58715)

When a MethodError occurs, check if functions with the same name exist
in other modules (particularly those of the argument types). This helps
users discover that they may need to import a function or ensure
multiple functions are the same generic function.

- For Base functions: suggests importing (e.g., "You may have intended
to import Base.length")
- For other modules: suggests they may be intended as the same generic
function
- Shows all matches from relevant modules in sorted order
- Uses modulesof! to properly handle all type structures including
unions

Fixes #58682

* Fix markdown bullet list in variables-and-scoping.md (#58771)

* CONTRIBUTING.md: Ask folks to disclose AI-written PRs (#58666)

* Convert julia-repl blocks to jldoctest format (#58594)

Convert appropriate julia-repl code blocks to jldoctest format to enable
automatic testing. In addition, this introduces a new `nodoctest = "reason"`
pattern to annotate code blocks that are deliberately not doctested, so
future readers will know not to try.

Many code blocks are converted, in particular:

- Manual pages: arrays.md, asynchronous-programming.md, functions.md,
  integers-and-floating-point-numbers.md, metaprogramming.md,
  multi-threading.md, performance-tips.md, variables.md,
  variables-and-scoping.md
- Base documentation: abstractarray.jl, bitarray.jl, expr.jl, file.jl,
  float.jl, iddict.jl, path.jl, scopedvalues.md, sort.md
- Standard library: Dates/conversions.jl, Random/RNGs.jl,
  Sockets/addrinfo.jl

Key changes:
- Add filters for non-deterministic output (timing, paths, memory
addresses)
- Add setup/teardown for filesystem operations
- Fix parentmodule(M) usage in expr.jl for doctest compatibility
- Document double escaping requirement for regex filters in docstrings
- Update AGENTS.md with test running instructions

Note: Some julia-repl blocks were intentionally left unchanged when they
demonstrate language internals subject to change or contain
non-deterministic output that cannot be properly filtered.

Refs #56921

---------

Co-authored-by: Keno Fischer <Keno@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>

* adds the `nth` function for iterables (#56580)

Hi,

I've turned the open ended issue #54454 into an actual PR.
Tangentially related to #10092 ?

This PR introduces the `nth(itr, n)` function to iterators to give a
`getindex` type of behaviour.
I've tried my best to optimize as much as possible by specializing on
different types of iterators.
In the spirit of iterators any OOB access returns `nothing`. (edit:
instead of throwing an error, i.e. `first(itr, n)` and `last(itr, n)`)

here is the comparison of running the testsuite (~22 different
iterators) using generic `nth` and specialized `nth`:
```julia
@btime begin
    for (itr, n, _) in $testset
        _fallback_nth(itr, n)
    end
end
117.750 μs (366 allocations: 17.88 KiB)

@btime begin
    for (itr, n, _) in $testset
        nth(itr, n)
    end
end
24.250 μs (341 allocations: 16.70 KiB)
```

---------

Co-authored-by: adienes <51664769+adienes@users.noreply.github.com>
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
Co-authored-by: Dilum Aluthge <dilum@aluthge.com>

* refine IR model queries (#58661)

- `jl_isa_ast_node` was missing `enter`/`leave` nodes.
- `Core.IR` exports mistakenly included a function `memoryref`.
- `Base.IR`, and `quoted` were not public or documented.
- Add julia function `isa_ast_node` to improve accuracy of `quoted`.
- Change `==` on AST nodes to check egal equality of any constants in
the IR / AST, and make hashing consistent with that change. This
helpfully allows determining that `x + 1` and `x + 1.0` are not
equivalent, exchangeable operations. If you need to compare any two
objects for semantic equality, you may need to first wrap them with `x =
Base.isa_ast_node(x) ? x : QuoteNode(x)` to resolve the ambiguity of
whether the comparison is of the semantics or value.
- Handle `undef` fields in Phi/PhiC node equality and hashing
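To make the `x + 1` versus `x + 1.0` point concrete, the sketch below compares only the constants embedded in the two expressions. It avoids the new `Base.isa_ast_node` so that it does not depend on this change being present.

```julia
ex_int   = :(x + 1)
ex_float = :(x + 1.0)

# The literal is the third argument of the call expression: (+, x, literal).
c_int   = ex_int.args[3]
c_float = ex_float.args[3]

numerically_equal = c_int == c_float    # `==` treats 1 and 1.0 as equal
egal              = c_int === c_float   # `===` (egal) distinguishes Int from Float64

# Wrapping a value in a QuoteNode marks it as data rather than code to evaluate.
quoted_sym = QuoteNode(:x)
```

Comparing constants with egal is what lets the two expressions be told apart, since they lower to different operations.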

* fix showing types after removing using Core (#58773)

PR #57357 changed the default using list, but only changed some of the
places where the `show` code handled that. This led to duplicate
(confusing) printing, since both Core. and Base. prefixes are dropped.

Fix #58772

* inform compiler about local variable definedness (#58778)

JET's new analysis pass now detects local variables that may be
undefined, which has revealed such issues in several functions within
Base (JuliaLang/julia#58762).

This commit addresses local variables whose definedness the compiler
cannot properly determine, primarily in functions reachable from JET's
test suite. No functional changes are made.

* better effects for `iterate` for `Memory` and `Array` (#58755)

* Test: Hide REPL internals in backtraces (#58732)

* Update docs for various type predicates (#58774)

Makes the description for `isdispatchtuple` accurate, adds a docstring
for `iskindtype` and `isconcretedispatch`, and adds notes to the docs
for `isconcretetype` and `isabstracttype` explaining why they aren't
antonyms.
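A short illustration of why the two predicates are not antonyms: some types are neither concrete nor abstract.

```julia
# `Vector` is a UnionAll over a concrete-shaped type, and a `Union` is not a
# DataType at all, so both predicates return false for them.
neither = [Vector, Union{Int,String}]
flags = [(isconcretetype(T), isabstracttype(T)) for T in neither]

concrete_example = isconcretetype(Vector{Int})      # fully specified type
abstract_example = isabstracttype(AbstractVector)   # declared `abstract type`
```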

* Test: show context when a let testset errors (#58727)

* [libblastrampoline_jll] Upgrade to v5.13.1 (#58775)

### Check list

Version numbers:
- [x] `deps/libblastrampoline.version`: `LIBNAME_VER`, `LIBNAME_BRANCH`,
`LIBNAME_SHA1` and `LIBNAME_JLL_VER`
- [x] `stdlib/libblastrampoline_jll/Project.toml`: `version`

Checksum:
- [x] `deps/checksums/libblastrampoline`

* 🤖 [master] Bump the Pkg stdlib from 5577f68d6 to e3d456127 (#58781)

Stdlib: Pkg
URL: https://github.com/JuliaLang/Pkg.jl.git
Stdlib branch: master
Julia branch: master
Old commit: 5577f68d6
New commit: e3d456127
Julia version: 1.13.0-DEV
Pkg version: 1.13.0
Bump invoked by: @KristofferC
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
https://github.com/JuliaLang/Pkg.jl/compare/5577f68d612139693282c037d070f515bf160d1b...e3d4561272fc029e9a5f940fe101ba4570fa875d

```
$ git log --oneline 5577f68d6..e3d456127
e3d456127 add update function to apps and fix a bug when adding an already installed app (#4263)
cae9ce02a Fix historical stdlib fixup if `Pkg` is in the Manifest (#4264)
a42046240 don't use tree hash from manifest if the path is set from sources (#4260)
a94a6bcae fix dev taking when the app is already installed (#4259)
313fddccb Internals: Add fallback `Base.show(::IO, ::RegistryInstance)` method (#4251)
```

Co-authored-by: KristofferC <1282691+KristofferC@users.noreply.github.com>

* prevent unnecessary repeated squaring calculation (#58720)

* LibGit2: Update to 1.9.1 (#58731)

* Unify `_checkbounds_array` into `checkbounds` and use it in more places (#58785)

Ref:
https://github.com/JuliaLang/julia/pull/58755#discussion_r2158944282.

---------

Co-authored-by: Matt Bauman <mbauman@juliahub.com>
Co-authored-by: Matt Bauman <mbauman@juliacomputing.com>

* Chained hash pipelining in array hashing (#58252)

the proposed switch in https://github.com/JuliaLang/julia/pull/57509
from `3h - hash_finalizer(x)` to `hash_finalizer(3h -x)` should increase
the hash quality of chained hashes, as the expanded expression goes from
something like `sum((-3)^k * hash(x) for k in ...)` to a
non-simplifiable composition

this does have the unfortunate impact of long chains of hashes getting a
bit slower as there is more data dependency and the CPU cannot work on
the next element's hash before combining the previous one (I think ---
I'm not particularly an expert on this low level stuff). As far as I
know this only really impacts `AbstractArray`

so, I've implemented a proposal that does some unrolling / pipelining
manually to recover `AbstractArray` hashing performance. in fact, it's
quite a lot faster now for most lengths. I tuned the thresholds (8
accumulators, certain length breakpoints) by hand on my own machine.
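The following is a rough sketch of the multiple-accumulator idea only, not the Base implementation, and it produces different hash values than `hash(::AbstractArray)`. The function name `lanes_hash` and the `nlanes` keyword are hypothetical.

```julia
# Keep several independent hash chains so consecutive elements do not depend
# on each other, then fold the chains together at the end.
function lanes_hash(A::AbstractVector, h::UInt; nlanes::Int = 8)
    lanes = fill(h, nlanes)                 # one running hash per lane
    for (i, x) in enumerate(A)
        k = mod1(i, nlanes)                 # round-robin lane assignment
        lanes[k] = hash(x, lanes[k])        # only depends on the same lane
    end
    # Combine the lanes, seeded with the length so sizes are distinguished.
    return foldl((acc, l) -> hash(l, acc), lanes; init = hash(length(A), h))
end

h1 = lanes_hash(collect(1:100), UInt(0))
h2 = lanes_hash(collect(1:100), UInt(0))
```

With eight lanes, element `i + 1` never waits on the result for element `i`, which is what lets the CPU overlap the work.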

* Require all tuples in eachindex to have the same length. (#48125)

Potential fix for #47898

---------

Co-authored-by: navdeep rana <navdeepr@tifrh.res.in>
Co-authored-by: Oscar Smith <oscardssmith@gmail.com>
Co-authored-by: Jerry Ling <proton@jling.dev>
Co-authored-by: Andy Dienes <51664769+adienes@users.noreply.github.com>

* trailing dimensions in eachslice (#58791)

fixes https://github.com/JuliaLang/julia/issues/51692

* Allow underscore (unused) args in presence of kwargs (#58803)

Admittedly fixed because I thought I introduced this bug recently, but
actually, fix #32727. `f(_; kw) = 1` should now lower in a similar way
to `f(::Any; kw) = 1`, where we use a gensym for the first argument.

Not in this PR, but TODO: `nospecialize` underscore-only names

* codegen: slightly optimize gc-frame allocation (#58794)

Try to avoid allocating frames for some very simple functions that only
have the safepoint on entry and don't define any values themselves.

* codegen: ensure safepoint functions can read the pgcstack (#58804)

This needs to be readOnly over all memory, since the GC could read anything
(especially pgcstack): not just argmem:read, but also memory reached
through pointers loaded from argmem.

Fix #58801

Note that this is thought not to be a problem for CleanupWriteBarriers,
since while that does read the previously-inaccessibleMemOnly state,
these functions are not marked nosync, so as long as the global state
can be read, it also must be assumed that it might observe another
thread has written to any global state.

* Revert code changes from "strengthen assume_effects doc" PR (#58289)

Reverts only the functional changes from JuliaLang/julia#58254, not the
docs. Accessing this field here assumes that the counter value is
numeric and relevant to the current inference frame, neither of which
is intended to be true, as we continue to add interfaces to execute
methods outside of their current specific implementation with a
monotonic world counter (e.g. with invoke on a Method, with precompile
files, with external MethodTables, or with static compilation).

* build: Error when attempting to set USECLANG/USEGCC (#58795)

Way back in the good old days, these used to switch between GCC and
Clang. I guess these days we always auto-switch based on the CC value.
If you try to directly set USECLANG, things get into a bad state. Give a
better error message for that case.

* build: Add --no-same-owner to TAR (#58796)

tar changes behavior when the current uid is 0 to try to also restore
owner uids/gids (if recorded). It is possible for the uid to be 0 in
single-uid environments like user namespace sandboxes, in which case the
attempt to change the uid/gid fails. Of course ideally, the tars would
have been created non-archival (so that the uid/gid wasn't recorded in
the first place), but we get source tars from various places, so we
can't guarantee this. To make sure we don't run into trouble, manually
add the --no-same-owner flag to disable this behavior.

* Add `cfunction` support for `--trim` (#58812)

* fix error message for `eachindex(::Vararg{Tuple})` (#58811)

Make the error message in case of mismatch less confusing and consistent
with the error message for arrays.

While at it, also made other changes of the same line of source code:

* use function composition instead of an anonymous closure

* expand the one-liner into a multiline `if`

---------

Co-authored-by: Andy Dienes <51664769+adienes@users.noreply.github.com>

* use more canonical way to check binding existence (#58809)

* Add `trim_mode` parameter to JIT type-inference entrypoint (#58817)

Resolves https://github.com/JuliaLang/julia/issues/58786.

I think this is only a partial fix, since we can still end up loading
code from pkgimages that has been poorly inferred due to running without
these `InferenceParams`. However, many of the common scenarios (such as
JLL's depending on each other) seem to be OK since we have a targeted
heuristic that adds `__init__()` to a pkgimage only if the module has
inference enabled.

* codegen: gc wb for atomic FCA stores (#58792)

Need to re-load the correct `r` since issetfield skips the intcast,
resulting in no gc wb for the FCA.

Fix #58760

* codegen: relaxed jl_tls_states_t.safepoint load (#58828)

Every function with a safepoint causes spurious thread sanitizer
warnings without this change. Codegen is unaffected, except when we
build with `ThreadSanitizerPass`.

* bpart: Properly track methods with invalidated source after require_world (#58830)

There are three categories of methods we need to worry about during
staticdata validation:
1. New methods added to existing generic functions
2. New methods added to new generic functions
3. Existing methods that now have new CodeInstances

In each of these cases, we need to check whether any of the implicit
binding edges from the method's source was invalidated. Currently, we
handle this for 1 and 2 by explicitly scanning the method on load.
However, we were not tracking it for case 3. Fix that by using an extra
bit in did_scan_method that gets set when we see an existing method
getting invalidated, so we know that we need to drop the corresponding
CodeInstances during load.

Fixes #58346

* Limit --help and --help-hidden to 100 character line length (#58835)

Just fixing the command line description to make sure it is not more
than 100 characters wide as discussed with @oscardssmith in PR #54066
and PR #53759.
I also added a test to make sure that nothing more than 100 characters
is inserted.
Thank you.

* libuv: Mark `(un)preserve_handle` as `@nospecialize` (#58844)

These functions only care about object identity, so there's no need to
specialize them on the type of their argument.
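As a sketch of the pattern (not the actual libuv code in Base), an identity-keyed reference count can take its argument with `@nospecialize`, since nothing in the body depends on the argument's type. `handle_refcounts`, `preserve!` and `unpreserve!` are hypothetical names.

```julia
# Reference counts keyed by object identity.
const handle_refcounts = IdDict{Any,Int}()

function preserve!(@nospecialize(x))
    handle_refcounts[x] = get(handle_refcounts, x, 0) + 1
    return x
end

function unpreserve!(@nospecialize(x))
    n = handle_refcounts[x]
    if n == 1
        delete!(handle_refcounts, x)   # last reference dropped
    else
        handle_refcounts[x] = n - 1
    end
    return nothing
end

h = Ref(1)
preserve!(h); preserve!(h)
unpreserve!(h)
remaining = handle_refcounts[h]
```

With `@nospecialize`, one compiled method body serves every handle type instead of one specialization per type.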

* add METHOD_SIG_LATEST_ONLY optimization to MethodInstance too (#58825)

Add the same optimization from Method to MethodInstance. Although the
performance gain seems to be negligible in my specific testing, there
doesn't seem to be any likely downside to adding one caching bit to avoid
some recomputations.

* Encode fully_covers=false edges using negative of method count

This change allows edges that don't fully cover their method matches to
be properly tracked through serialization. When fully_covers is false
(indicating incomplete method coverage), we encode the method count as
negative in the edges array to signal that compactly.
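The sign trick can be sketched as below. This is only an illustration of packing a boolean into the sign of a count; the real edges array layout is not shown, and `encode_count` / `decode_count` are hypothetical helpers. The sketch assumes the count is positive, since zero cannot carry a sign.

```julia
# Pack `fully_covers` into the sign of the method count.
encode_count(n::Integer, fully_covers::Bool) = fully_covers ? Int(n) : -Int(n)

# Recover both pieces; a non-negative entry means full coverage.
decode_count(e::Integer) = (count = abs(e), fully_covers = e >= 0)

partial = decode_count(encode_count(3, false))
full    = decode_count(encode_count(3, true))
```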

* move trim patches to separate files, only load if trimming (#58826)

fixes part of #58458

* gf: Add METHOD_SIG_LATEST_HAS_NOTMORESPECIFIC dispatch status bit

This commit introduces a new dispatch status bit to track when a method
has other methods that are not more specific than it, enabling better
optimization decisions during method dispatch.

Key changes:
  1. Add METHOD_SIG_LATEST_HAS_NOTMORESPECIFIC bit to track methods with
     non-morespecific intersections
  2. Add corresponding METHOD_SIG_PRECOMPILE_HAS_NOTMORESPECIFIC bit for
     precompiled methods
  3. Refactor method insertion logic:
     - Remove morespec_unknown enum state, compute all morespec values upfront
     - Convert enum morespec_options to simple boolean logic (1/0)
     - Change 'only' from boolean to 'dispatch_bits' bitmask
     - Move dispatch status updates before early continues in the loop

* optimize verify_call again

* juliac: Add rudimentary Windows support (#57481)

This was essentially working as-is, except for our reliance on a C
compiler.

Not sure how we feel about having an `Artifacts.toml` floating around
our `contrib` folder, but I'm not aware of an alternative other than
moving `juliac.jl` to a subdirectory.

* fix null comparisons for non-standard address spaces (#58837)

Co-authored-by: Jameson Nash <vtjnash@gmail.com>

* debuginfo: Memoize object symbol lookup (#58851)

Supersedes https://github.com/JuliaLang/julia/pull/58355. Resolves
https://github.com/JuliaLang/julia/issues/58326.

On this PR:
```julia
julia> @btime lgamma(2.0)
┌ Warning: `lgamma(x::Real)` is deprecated, use `(logabsgamma(x))[1]` instead.
│   caller = var"##core#283"() at execution.jl:598
└ @ Core ~/.julia/packages/BenchmarkTools/1i1mY/src/execution.jl:598
  47.730 μs (105 allocations: 13.24 KiB)
```

On `nightly`:
```julia
julia> @btime lgamma(2.0)
┌ Warning: `lgamma(x::Real)` is deprecated, use `(logabsgamma(x))[1]` instead.
│   caller = var"##core#283"() at execution.jl:598
└ @ Core ~/.julia/packages/BenchmarkTools/1i1mY/src/execution.jl:598
  26.856 ms (89 allocations: 11.32 KiB)
```

* bpart: Skip inserting image backedges while we're generating a pkgimage (#58843)

Should speed up deeply nested precompiles by skipping unnecessary work
here.

PR is against #58830 to avoid conflicts, but semantically independent.

* Re-add old function name for backward compatibility in init (#58860)

While julia has no C-API backwards compatibility guarantees this is
simple enough to add.

Fixes #58859

* trimming: Add `_uv_hook_close` support (#58871)

Resolves https://github.com/JuliaLang/julia/issues/58862.

Since this hook is called internally by the runtime, `--trim` was not
aware of the callee edge required here.

* Don't `@inbounds` AbstractArray's iterate method; optimize `checkbounds` instead (#58793)

Split off from #58785, this simplifies `iterate` and removes the
`@inbounds` call that was added in
https://github.com/JuliaLang/julia/pull/58635. It achieves the same (or
better!) performance, however, by targeting optimizations in
`checkbounds` and — in particular — the construction of a linear
`eachindex` (against which the bounds are checked).

---------

Co-authored-by: Mosè Giordano <mose@gnu.org>

* aotcompile: Fix early-exit if CI not found for `cfunction` (#58722)

As written, this was accidentally skipping all the subsequent `cfuncs`
that need adapters.

* zero-index get/setindex(::ReinterpretArray) require a length of 1 (#58814)

fix https://github.com/JuliaLang/julia/issues/58232

o3 helped me understand the existing implementations but code is mine

---------

Co-authored-by: Matt Bauman <mbauman@gmail.com>

* Add `Base.isprecompilable` (#58805)

Alternative to https://github.com/JuliaLang/julia/pull/58146.

We want to compile a subset of the possible specializations of a
function. To this end, we have a number of manually written `precompile`
statements. Creating this list is, unfortunately, error-prone, and the
list is also liable to going stale. Thus we'd like to validate each
`precompile` statement in the list.

The simple answer is, of course, to actually run the `precompile`s, and
we naturally do so, but this takes time.

We would like a relatively quick way to check the validity of a
`precompile` statement.
This is a dev-loop optimization, to allow us to check "is-precompilable"
in unit tests.

We can't use `hasmethod` as it has both false positives (too loose):
```julia
julia> hasmethod(sum, (AbstractVector,))
true

julia> precompile(sum, (AbstractVector,))
false

julia> Base.isprecompilable(sum, (AbstractVector,)) # <- this PR
false
```
and also false negatives (too strict):
```julia
julia> bar(@nospecialize(x::AbstractVector{Int})) = 42
bar (generic function with 1 method)

julia> hasmethod(bar, (AbstractVector,))
false

julia> precompile(bar, (AbstractVector,))
true

julia> Base.isprecompilable(bar, (AbstractVector,)) # <- this PR
true
```
We can't use `hasmethod && isconcretetype` as it has false negatives
(too strict):
```julia
julia> has_concrete_method(f, argtypes) = all(isconcretetype, argtypes) && hasmethod(f, argtypes)
has_concrete_method (generic function with 1 method)

julia> has_concrete_method(bar, (AbstractVector,))
false

julia> has_concrete_method(convert, (Type{Int}, Int32))
false

julia> precompile(convert, (Type{Int}, Int32))
true

julia> Base.isprecompilable(convert, (Type{Int}, Int32))  # <- this PR
true
```
`Base.isprecompilable` is essentially `precompile` without the actual
compilation.

* Add a `similar` method for `Type{<:CodeUnits}` (#57826)

Currently, `similar(::CodeUnits)` works as expected by going through the
generic `AbstractArray` method. However, the fallback method hit by
`similar(::Type{<:CodeUnits}, dims)` does not work, as it assumes the
existence of a constructor that accepts an `UndefInitializer`. This can
be made to work by defining a corresponding `similar` method that
returns an `Array`.

One could make a case that this is a bugfix since it was arguably a bug
that this method didn't work given that `CodeUnits` is an
`AbstractArray` subtype and the other `similar` methods work. If anybody
buys that argument, it could be nice to backport this; it came up in
some internal code that uses Arrow.jl and JSON3.jl together.
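
A minimal sketch of the two call forms discussed above, not taken from the PR itself. The instance form goes through the generic `AbstractArray` method on any recent Julia. The type form is guarded with `try`, on the assumption that it throws on versions without the new method and returns an `Array` on versions with it.

```julia
cu = codeunits("hello")          # Base.CodeUnits{UInt8, String}

# Instance form: handled by the generic AbstractArray fallback.
buf = similar(cu)
println(typeof(buf), " of length ", length(buf))

# Type form: the call this change is about. Guarded because the outcome
# depends on whether the running Julia has the new method.
type_form = try
    similar(typeof(cu), 3)
catch err
    err
end
println(typeof(type_form))
```

On a build without the method, the second `println` is expected to show an exception type rather than an array type.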

* 🤖 [master] Bump the Pkg stdlib from e3d456127 to 109eaea66 (#58858)

Stdlib: Pkg
URL: https://github.com/JuliaLang/Pkg.jl.git
Stdlib branch: master
Julia branch: master
Old commit: e3d456127
New commit: 109eaea66
Julia version: 1.13.0-DEV
Pkg version: 1.13.0
Bump invoked by: @KristofferC
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
https://github.com/JuliaLang/Pkg.jl/compare/e3d4561272fc029e9a5f940fe101ba4570fa875d...109eaea66a0adb0ad8fa497e64913eadc2248ad1

```
$ git log --oneline e3d456127..109eaea66
109eaea66 Various app improvements (#4278)
25c2390ed feat(apps): Add support for multiple apps per package via submodules (#4277)
c78b40b35 copy the app project instead of wrapping it (#4276)
d2e61025b Fix leading whitespace in REPL commands with comma-separated packages (#4274)
e02bcabd7 Registry: Properly pass down `depot` (#4268)
e9a055240 fix what project file to look at when package without path but with a subdir is devved by name (#4271)
8b1f0b9ff prompt for confirmation before removing compat entry (#4254)
eefbef649 feat(errors): Improve error message for incorrect package UUID (#4270)
4d1c6b0a3 explain no reg installed when no reg installed (#4261)
```

Co-authored-by: KristofferC <1282691+KristofferC@users.noreply.github.com>

* fix a few tiny JET linter issues (#58869)

* Fix data race in jl_new_module__ (#58880)

Use an atomic fetch and add to fix a data race in `Module()` identified
by tsan:

```
./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end'
==================
WARNING: ThreadSanitizer: data race (pid=5575)
  Write of size 4 at 0xffff9bf9bd28 by thread T9:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Previous write of size 4 at 0xffff9bf9bd28 by thread T10:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28)
```
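
The actual fix lives in C (`src/module.c`). As an illustration only, the same fetch-and-add idea can be sketched in Julia with `Threads.Atomic`. The names `mcounter` and `next_id` below are hypothetical stand-ins, not the runtime's code.

```julia
# Hypothetical stand-in for the shared counter flagged by tsan.
mcounter = Threads.Atomic{UInt32}(0)

# atomic_add! performs the read-modify-write as one atomic step and
# returns the value held *before* the addition (fetch-and-add).
next_id() = Threads.atomic_add!(mcounter, UInt32(1))

ids = Vector{UInt32}(undef, 100)
Threads.@threads for i in 1:100
    ids[i] = next_id()       # each iteration writes to its own slot
end

println("final counter: ", mcounter[])
println("distinct ids:  ", length(unique(ids)))
```

Because every increment is atomic, no two iterations can observe the same previous value, regardless of how many threads Julia was started with.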

* fix trailing indices stackoverflow in reinterpreted array (#58293)

would fix https://github.com/JuliaLang/julia/issues/57170, fix
https://github.com/JuliaLang/julia/issues/54623

@nanosoldier `runbenchmarks("array", vs=":master")`

* Add missing module qualifier (#58877)

A very simple fix addressing the following bug:
```julia
Validation: Error During Test at REPL[61]:1
  Got exception outside of a @test
  #=ERROR showing exception stack=# UndefVarError: `get_ci_mi` not defined in `Base.StackTraces`
  Suggestion: check for spelling errors or missing imports.
  Hint: a global variable of this name also exists in Base.
      - Also declared public in Compiler (loaded but not imported in Main).
  Stacktrace:
    [1] show_custom_spec_sig(io::IOContext{IOBuffer}, owner::Any, linfo::Core.CodeInstance, frame::Base.StackTraces.StackFrame)
      @ Base.StackTraces ./stacktraces.jl:293
    [2] show_spec_linfo(io::IOContext{IOBuffer}, frame::Base.StackTraces.StackFrame)
      @ Base.StackTraces ./stacktraces.jl:278
    [3] print_stackframe(io::IOContext{IOBuffer}, i::Int64, frame::Base.StackTraces.StackFrame, n::Int64, ndigits_max::Int64, modulecolor::Symbol; prefix::Nothing)
      @ Base ./errorshow.jl:786
```

AFAIK this occurs when printing a stacktrace from a `CodeInstance` that
has a non-default owner.

* OpenSSL: Update to 3.5.1 (#58876)

Update the stdlib OpenSSL to 3.5.1.

This is a candidate for backporting to Julia 1.12 if there is another
beta release.

* `setindex!(::ReinterpretArray, v)` needs to convert before reinterpreting (#58867)

Found in
https://github.com/JuliaLang/julia/pull/58814#discussion_r2169155093.

Previously, in a very limited situation (a zero-dimensional reinterpret
that reinterprets between primitive types that was setindex!'ed with
zero indices), we omitted the `convert`. I believe this was an
unintentional oversight, and hopefully nobody is depending on this
behavior.
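
For reference, a small sketch of the convert-then-reinterpret order in the ordinary one-dimensional case, which already behaved this way. It does not exercise the zero-dimensional, zero-index path this PR fixes; it only shows the behavior that path is being brought in line with.

```julia
a = UInt32[0]
r = reinterpret(Float32, a)   # view the same 4 bytes as a Float32

# Assigning an Int: the value is first converted to Float32 (1 -> 1.0f0),
# and only then are its bits written into the parent array.
r[1] = 1

println(r[1])                   # value seen through the Float32 view
println(string(a[1], base=16))  # bit pattern stored in the parent
```

Without the `convert`, the integer's own bits would be reinterpreted instead, which is not what an assignment of `1` to a `Float32` slot should mean.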

* Support `debuginfo` context option in IRShow for `IRCode`/`IncrementalCompact` (#58642)

This allows us to get complete source information during printing for
`IRCode` and `IncrementalCompact`, same as we do by default with
`CodeInfo`.

The user previously had to do:
```julia
Compiler.IRShow.show_ir(stdout, ir, Compiler.IRShow.default_config(ir; verbose_linetable=true))
```

and now, they only need to do:
```julia
show(IOContext(stdout, :debuginfo => :source), ir)
```

* Add offset in `hvncat` dimension calculation to fix issue with 0-length elements in first dimension (#58881)

* fix `setindex(::ReinterpretArray,...)` for zero-d arrays (#58868)

by copying the way getindex works.

Found in
https://github.com/JuliaLang/julia/pull/58814#discussion_r2178243259

---------

Co-authored-by: Andy Dienes <51664769+adienes@users.noreply.github.com>

* add back `to_power_type` to `deprecated.jl` since some packages call it (#58886)

Co-authored-by: KristofferC <kristoffer.carlsson@juliacomputing.com>

* Pkg: Allow configuring can_fancyprint(io::IO) using IOContext (#58887)

* Make `Base.donotdelete` public (#55774)

I rely on `Base.donotdelete` in
[Chairmarks.jl](https://chairmarks.lilithhafner.com) and I'd like it to
be public. I imagine that other benchmarking tools also rely on it. It's
been around since 1.8 (see also: #55773) and I think we should commit to
keeping it functional for the rest of 1.x.
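
A rough sketch of the kind of use a benchmarking tool might make of it (the function `sum_sqrt` is made up for illustration and is not from Chairmarks.jl): the result of the loop is otherwise unused, so the call is there to keep the optimizer from discarding the work.

```julia
function sum_sqrt(n)
    s = 0.0
    for i in 1:n
        s += sqrt(i)
    end
    # `s` is never returned; donotdelete marks it as "used" so the
    # computation above is not removed as dead code.
    Base.donotdelete(s)
    return nothing
end

sum_sqrt(10)
```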

* Add link to video in profiling manual (#58896)

* Stop documenting that `permute!` is "in-place"; it isn't and never has been non-allocating (#58902)
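
A short sketch of what the title refers to, assuming nothing beyond the documented behavior of `permute!`: the vector is mutated, but mutating the argument says nothing about whether temporary memory is allocated along the way.

```julia
v = [10, 20, 30]
p = [3, 1, 2]

# Mutates `v` so that it holds the old `v[p]`; `p` is left unchanged.
permute!(v, p)

println(v)
println(p)
```

Wrapping the call in `@allocated` is one way to check the allocation behavior on a given Julia version.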

* faster iteration over a `Flatten` of heterogenous iterators (#58522)

seems to help in many cases. would fix the precise MWE given in
https://github.com/JuliaLang/julia/issues/52552, but does not
necessarily fix comprehensively all perf issues of all heterogenous
flattens. but, may as well be better when it's possible


setup:
```
julia> using BenchmarkTools

julia> A = rand(Int, 100000); B = 1:100000;

julia> function g(it)
           s = 0
           for i in it
               s += i
           end
           s
       end
```

before:
```
julia> @btime g($(Iterators.flatten((A, B))))
  12.461 ms (698979 allocations: 18.29 MiB)

julia> @btime g($(Iterators.flatten(i for i in (A, B))))
  12.393 ms (698979 allocations: 18.29 MiB)

julia> @btime g($(Iterators.flatten([A, B])))
  15.115 ms (999494 allocations: 25.93 MiB)

julia> @btime g($(Iterators.flatten((A, Iterators.flatten((A, B))))))
  82.585 ms (2997964 allocations: 106.78 MiB)
```

after:
```
julia> @btime g($(Iterators.flatten((A, B))))
  135.958 μs (2 allocations: 64 bytes)

julia> @btime g($(Iterators.flatten(i for i in (A, B))))
  149.500 μs (2 allocations: 64 bytes)

julia> @btime g($(Iterators.flatten([A, B])))
  17.130 ms (999498 allocations: 25.93 MiB)

julia> @btime g($(Iterators.flatten((A, Iterators.flatten((A, B))))))
  13.716 ms (398983 allocations: 10.67 MiB)
```

* Make `hypot` docs example more type stable (#58918)

* Markdown: Make `Table`/`LaTeX` objects subtypes of `MarkdownElement` (#58916)

These objects satisfy the requirements of the `MarkdownElement`
interface (such as implementing `Markdown.plain`), so they should be
subtypes of `MarkdownElement`. This is convenient when defining
functions for `MarkdownElement` in other packages.

* Support "Functor-like" `code_typed` invocation (#57911)

This lets you easily inspect IR associated with "Functor-like" methods:
```julia
julia> (f::Foo)(offset::Float64) = f.x + f.y + offset
julia> code_typed((Foo, Float64))
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Base.getfield(f, :x)::Int64
│   %2 = Base.getfield(f, :y)::Int64
│   %3 = Base.add_int(%1, %2)::Int64
│   %4 = Base.sitofp(Float64, %3)::Float64
│   %5 = Base.add_float(%4, offset)::Float64
└──      return %5
) => Float64
```

This is just a small convenience over `code_typed_by_type`, but I'm in
support of it (even though it technically changes the meaning of, e.g.,
`code_typed((1, 2))`, which without this PR inspects
`(::Tuple{Int,Int})(::Vararg{Any})`).

We should probably update all of our reflection machinery (`code_llvm`,
`code_lowered`, `methodinstance`, etc.) to support this "non-arg0" style
as well, but I wanted to open this first to make sure folks like it.

* IRShow: Print arg0 type when necessary to disambiguate `invoke` (#58893)

When invoking any "functor-like", such as a closure:
```julia
bar(x) = @noinline ((y)->x+y)(x)
```
our IR printing was not showing the arg0 invoked, even when it is
required to determine which MethodInstance this is invoking.

Before:
```julia
julia> @code_typed optimize=true bar(1)
CodeInfo(
1 ─ %1 = %new(var"#bar##2#bar##3"{Int64}, x)::var"#bar##2#bar##3"{Int64}
│   %2 =    invoke %1(x::Int64)::Int64
└──      return %2
) => Int64
```

After:
```julia
julia> @code_typed optimize=true bar(1)
CodeInfo(
1 ─ %1 = %new(var"#bar##2#bar##3"{Int64}, x)::var"#bar##2#bar##3"{Int64}
│   %2 =    invoke (%1::var"#bar##2#bar##3"{Int64})(x::Int64)::Int64
└──      return %2
) => Int64
```

* Support "functors" for code reflection utilities (#58891)

As a follow-up to https://github.com/JuliaLang/julia/pull/57911, this
updates:
 - `Base.method_instance`
 - `Base.method_instances`
 - `Base.code_ircode`
 - `Base.code_lowered`
 - `InteractiveUtils.code_llvm`
 - `InteractiveUtils.code_native`
 - `InteractiveUtils.code_warntype`

to support "functor" invocations.

e.g. `code_llvm((Foo, Int, Int))` which corresponds to `(::Foo)(::Int, ::Int)`

* Prevent data races in invalidate_code_for_globalref!

* Fix type instability in invalidate_code_for_globalref!

* Add the fact that functions ending with `!` may allocate to the FAQ (#58904)

I've run into this question several times, that might count as
"frequently asked".

* Economy mode REPL: run the event loop with jl_uv_flush (#58926)

`ios_flush` won't wait for the `jl_static_show` from the previous
evaluation to complete, resulting in the output being interleaved with
subsequent REPL outputs. Anything that produces a lot of output will
trigger it, like `Core.GlobalMethods.defs`.

* Fix grammar, typos, and formatting issues in docstrings (#58944)

Co-authored-by: Claude <noreply@anthropic.com>

* Fix nthreadpools size in JLOptions (#58937)

* NFC: Remove duplicate `julia-src-%` dependency in makefile (#58947)

* Improve error message for missing dependencies in packages (#58878)

* Make current_terminfo a OncePerProcess (#58854)

There seems to be no reason to always load this unconditionally -
especially since it's in the critical startup path. If we never print
colored output or our IO is not a TTY, we don't need to load this at
all. While we're at it, remove the `term_type` argument to
`ttyhascolor`, which didn't work as advertised anyway, since it still
looked at the current_terminfo. If clients want to do a full TermInfo
check, they can do that explicitly.

(Written by Claude Code)

* chore: remove redundant words in comment (#58955)

* add a precompile workload to TOML (#58949)

* 🤖 [master] Bump the NetworkOptions stdlib from c090626 to 532992f (#58882)

Co-authored-by: DilumAluthge <5619885+DilumAluthge@users.noreply.github.com>

* remove excessive code from trim script (#58853)

Co-authored-by: gbaraldi <baraldigabriel@gmail.com>

* Add juliac Artifacts.toml in Makefile (#58936)

* staticdata: Don't discard inlineable code that inference may need (#58842)

See
https://github.com/JuliaLang/julia/issues/58841#issuecomment-3014833096.
We were accidentally discarding inferred code during staticdata
preparation that we would need immediately afterwards to satisfy
inlining requests during code generation for the system image. This was
resulting in spurious extra compilation at the first inference after
sysimage reload. Additionally it was likely causing various unnecessary
dispatch slow paths in the generated inference code. Fixes #58841.

* clear up `isdone` docstring (#58958)

I got pretty confused on my first reading of this docstring because for
some reason I thought it was saying that `isdone(itr, state) == missing`
implied that it was true that `iterate(itr, state) === nothing` (aka
that `state` is indeed final). which of course is wrong and doesn't make
sense, but it's still how I read it. I think the new docstring is a bit
more explicit.

* shield `_artifact_str` function behind a world age barrier (#58957)

We already do this for `require` in Base loading, it probably makes
sense to do this here as well, as invalidating this function easily adds
+1s in load time for a jll. Avoids the big load time penalty from
loading IntelOpenMP_jll in
https://github.com/JuliaLang/julia/issues/57436#issuecomment-3052258775.

Before:

```
julia> @time using ModelingToolkit
  6.546844 seconds (16.09 M allocations: 938.530 MiB, 11.13% gc time, 16.35% compilation time: 12% of which was recompilation)
```

After:

```
julia> @time using ModelingToolkit
  5.637914 seconds (8.26 M allocations: 533.694 MiB, 11.47% gc time, 3.11% compilation time: 17% of which was recompilation)
```

---------

Co-authored-by: KristofferC <kristoffer.carlsson@juliacomputing.com>
Co-authored-by: Cody Tapscott <84105208+topolarity@users.noreply.github.com>

* doc: Fix grammar, typos, and formatting issues across documentation (#58932)

Co-authored-by: Claude <noreply@anthropic.com>

* Replace Base.Workqueues with a OncePerThread (#58941)

Simplify `workqueue_for`. While not strictly necessary, the acquire load
in `getindex(once::OncePerThread{T,F}, tid::Integer)` makes
ThreadSanitizer happy. With the existing implementation, we get false
positives whenever a thread other than the one that originally allocated
the array reads it:

```
==================
WARNING: ThreadSanitizer: data race (pid=6819)
  Atomic read of size 8 at 0xffff86bec058 by main thread:
    #0 getproperty Base_compiler.jl:57 (sys.so+0x113b478)
    #1 julia_pushNOT._1925 task.jl:868 (sys.so+0x113b478)
    #2 julia_enq_work_1896 task.jl:969 (sys.so+0x5cd218)
    #3 schedule task.jl:983 (sys.so+0x892294)
    #4 macro expansion threadingconstructs.jl:522 (sys.so+0x892294)
    #5 julia_start_profile_listener_60681 Base.jl:355 (sys.so+0x892294)
    #6 julia___init___60641 Base.jl:392 (sys.so+0x1178dc)
    #7 jfptr___init___60642 <null> (sys.so+0x118134)
    #8 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
    #9 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
    #10 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0xbba74)
    #11 jl_module_run_initializer /home/user/c/julia/src/toplevel.c:68:13 (libjulia-internal.so.1.13+0xbba74)
    #12 _finish_jl_init_ /home/user/c/julia/src/init.c:632:13 (libjulia-internal.so.1.13+0x9c0fc)
    #13 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
    #14 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
    #15 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
    #16 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)

  Previous write of size 8 at 0xffff86bec058 by thread T2:
    #0 IntrusiveLinkedListSynchronized task.jl:863 (sys.so+0x78d220)
    #1 macro expansion task.jl:932 (sys.so+0x78d220)
    #2 macro expansion lock.jl:376 (sys.so+0x78d220)
    #3 julia_workqueue_for_1933 task.jl:924 (sys.so+0x78d220)
    #4 julia_wait_2048 task.jl:1204 (sys.so+0x6255ac)
    #5 julia_task_done_hook_49205 task.jl:839 (sys.so+0x128fdc0)
    #6 jfptr_task_done_hook_49206 <null> (sys.so+0x902218)
    #7 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
    #8 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
    #9 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9c79c)
    #10 jl_finish_task /home/user/c/julia/src/task.c:345:13 (libjulia-internal.so.1.13+0x9c79c)
    #11 jl_threadfun /home/user/c/julia/src/scheduler.c:122:5 (libjulia-internal.so.1.13+0xe7db8)

  Thread T2 (tid=6824, running) created by main thread at:
    #0 pthread_create <null> (julia+0x85f88)
    #1 uv_thread_create_ex /workspace/srcdir/libuv/src/unix/thread.c:172 (libjulia-internal.so.1.13+0x1a8d70)
    #2 _finish_jl_init_ /home/user/c/julia/src/init.c:618:5 (libjulia-internal.so.1.13+0x9c010)
    #3 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
    #4 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
    #5 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
    #6 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)

SUMMARY: ThreadSanitizer: data race Base_compiler.jl:57 in getproperty
==================
```

* Fix `hygienic-scope`s in inner macro expansions (#58965)

Changes from https://github.com/JuliaLang/julia/pull/43151, github just
didn't want me to re-open it.

As discussed on slack, any `hygienic-scope` within an outer
`hygienic-scope` can read and write variables in the outer one, so it's
not particularly hygienic. The result is that we can't safely nest macro
calls unless they know the contents of all inner macro calls.

Should fix #48910.

Co-authored-by: Michiel Dral <m.c.dral@gmail.com>

* remove comment from julia-syntax that is no longer true (#58964)

The code this referred to was removed by
c6c3d72d1cbddb3d27e0df0e739bb27dd709a413

* expand memoryrefnew capabilities (#58768)

The goal here is 2-fold. Firstly, this should let us simplify the
boundscheck (not yet implemented), but this also should reduce Julia IR
size a bit.

* Add news entry and update docstring for #58727 (#58973)

* Fix alignment of failed precompile jobs on CI (#58971)

* bpart: Tweak `isdefinedglobal` on backdated constant (#58976)

In d2cc06193ef4161e4ac161bd4b5b57a51686a89a and prior commits, we made
backdated access a conditional error (if depwarns are enabled or in
generators). However, we did not touch `isdefinedglobal`. This resulted
in the common pattern `isdefinedglobal(m, s) && getglobal(m, s)` to
sometimes error. In particular, this could be observed when attempting
to print a type from inside a generated function before that type's
definition age.

Additionally, I think the usage there, which used `invokelatest` on each
of the two queries is problematic because it is racy, since the two
`invokelatest` calls may be looking at different world ages.

This makes two tweaks:
1. Makes `isdefinedglobal` consistent with `getglobal` in that it now
returns false if `getglobal` would throw due to the above referenced
restriction.
2. Removes the implicit `invokelatest` in `_isself` in the show code.
Instead, it will use the current world. I considered having it use the
exception age when used for MethodErrors. However, because this is used
for printing it matters more how the object can be accessed *now* rather
than how it could have been accessed in the past.

* Fix precompilepkgs warn loaded setting (#58978)

* specify that `Iterators.rest` must be given a valid `state` (#58962)

~currently `Iterators.rest(1:2, 3)` creates an infinite loop. after this
PR it would be an `ArgumentError`~

docs only now

* stdlib/Dates: Fix doctest regex to handle DateTime with 0 microseconds (#58981)

The `now(UTC)` doctest can fail when the DateTime has exactly 0
milliseconds, as the output format omits the fractional seconds entirely
(e.g., "2023-01-04T10:52:24" instead of "2023-01-04T10:52:24.000").

Update the regex filter to make the milliseconds portion optional by
using `(\\.\\d{3})?` instead of `\\.\\d{3}`.
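
To illustrate the difference, here is a sketch with simplified patterns. The date and time prefix and the anchors are assumptions made for the example; only the fractional-seconds part mirrors the change described above. The doubled backslashes in the commit text come from the pattern living inside a string, so a regex literal needs only one.

```julia
old_pat = r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}$"
new_pat = r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{3})?$"

samples = ["2023-01-04T10:52:24.000", "2023-01-04T10:52:24"]

old_hits = [occursin(old_pat, s) for s in samples]
new_hits = [occursin(new_pat, s) for s in samples]

println("old: ", old_hits)
println("new: ", new_hits)
```

The old pattern misses the second sample, which is the form printed when the fractional part is omitted.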

Fixes CI failure:
https://buildkite.com/julialang/julia-master/builds/49144#0197fd72-d1c6-44d6-9c59-5f548ab98f04

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Keno Fischer <Keno@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>

* Fix unique for range wrappers with zero step (#51004)

The current implementation assumes that the vector indexing
`r[begin:begin]` is specialized to return a range, which isn't the case
by default. As a consequence,
```julia
julia> struct MyStepRangeLen{T,R} <: AbstractRange{T}
           x :: R
       end

julia> MyStepRangeLen(s::StepRangeLen{T}) where {T} = MyStepRangeLen{T,typeof(s)}(s)
MyStepRangeLen

julia> Base.first(s::MyStepRangeLen) = first(s.x)

julia> Base.last(s::MyStepRangeLen) = last(s.x)

julia> Base.length(s::MyStepRangeLen) = length(s.x)

julia> Base.step(s::MyStepRangeLen) = step(s.x)

julia> r = MyStepRangeLen(StepRangeLen(1,0,4))
1:0:1

julia> unique(r)
ERROR: MethodError: Cannot `convert` an object of type Vector{Int64} to an object of type MyStepRangeLen{Int64, Int64, StepRangeLen{Int64, Int64, Int64, Int64}}
[...]
```
This PR fixes this by constructing a `UnitRange` instead of using the
indexing operation. After this, we obtain
```julia
julia> unique(r)
1:1:1
```
In principle, the `step` should be preserved, but `range(r[begin]::Int,
step=step(r), length=length(r))` appears to error at present, as it
tries to construct a `StepRange` instead of a `StepRangeLen`.

This fix isn't perfect as it assumes that the conversion from a
`UnitRange` _is_ defined, which is also not the case by default. For
example, the following still won't work:
```julia
julia> struct MyRange <: AbstractRange{Int} end

julia> Base.first(x::MyRange) = 1

julia> Base.last(x::MyRange) = 1

julia> Base.length(x::MyRange) = 3

julia> Base.step(x::MyRange) = 0

julia> unique(MyRange())
ERROR: MethodError: no method matching MyRange(::UnitRange{Int64})
[...]
```
In fact, if the indexing `MyRange()[begin:begin]` has been specialized
but the conversion from a `UnitRange` isn't, then this is actually a
regression. I'm unsure if such pathological cases are common, though.
The reason the first example works is that the conversion for a range
wrapper is defined implicitly if the parent type supports conversion
from a `UnitRange`.

* Docs: add GC user docs (#58733)

Co-authored-by: Andy Dienes <51664769+adienes@users.noreply.github.com>
Co-authored-by: Gabriel Baraldi <28694980+gbaraldi@users.noreply.github.com>
Co-authored-by: Diogo Netto <61364108+d-netto@users.noreply.github.com>

* 🤖 [master] Bump the Pkg stdlib from 109eaea66 to b85e29428 (#58991)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* Add one-argument `argtypes` methods to source reflection functions (#58925)

Follow-up to
https://github.com/JuliaLang/julia/pull/58891#issuecomment-3036419509,
extending the feature to `which`, `functionloc`, `edit` and `less`.

* Test: Add compiler hint for `ts` variable definedness in `@testset for` (#58989)

Helps the new language server avoid reporting unused variable reports.

* trimming: explicitly add Libdl dep for test/trimming/basic_jll.jl (#58990)

* win/msys2: Automatically switch msys2 symlinks mode for LLVM (#58988)

As noted in
https://github.com/JuliaLang/julia/issues/54981#issuecomment-2336444226,
msys2 currently fails to untar an llvm source build. Fix that by setting
the appropriate environment variable to switch the symlinks mode.

* Fix order of MSYS rules (#58999)

git-external changes the LLVM_SRC_DIR variable, so the target-specific
variable applies to the wrong target if defined before it - didn't
notice in local testing because I had accidentally switched the variable
globally earlier for testing - but showed up on a fresh build.

* msys2: Recommend correct cmake package (#59001)

msys2 ships 2 different cmake packages, one built natively (with mingw
prefix in the package name) and one built against the posix emulation
environment. The posix emulation one does not work because it will
detect unix-style paths, which it then writes into files that native
tools process. Unlike during command invocation (where the msys2 runtime
library does path translation), when paths are written to files, they
are written verbatim.

The practical result of this is that e.g. the LLVM build will fail with
a mysterious libz link failure (as e.g. reported in #54981).

This is our fault, because our build instructions tell the user to
install the wrong one.

Fix all that by
1. Correcting the build instructions to install the correct cmake
2. Detecting if the wrong cmake is installed and advising the correct
one
3. Fixing an issue where the native CMake did not like our
CMAKE_C_COMPILER setting.

With all this, CMake runs correctly under msys2 with
USE_BINARYBUILDER_LLVM=0.

* feat(REPL): Added `active_module` context to numbered REPL (#59000)

* optimize `length(::OrdinalRange)` for large bit-ints (#58864)

Split from #58793, this coalesces nearly all the branches in `length`,
allowing it to inline and generally perform much better while retaining
the exact same functionality.

---------

Co-authored-by: N5N3 <2642243996@qq.com>

* Fix LLVM TaskDispatcher implementation issues (#58950)

Fixes #58229 (LLVM JITLink stack overflow issue)

I tried submitting this promise/future implementation upstream
(https://github.com/llvm/llvm-project/compare/main...vtjnash:llvm-project:jn/cowait-jit)
so that I would not need to duplicate nearly as much code here to fix
this bug, but upstream is currently opposed to fixing this bug and
instead insists it is preferable for each downstream project to
implement this fix themselves adding extra maintenance burden for us for
now. Sigh.

* Improve --trace-dispatch coverage: emit in "full-cache" fast path as well. (#59012)

This PR extends the `--trace-dispatch` logging inside `jl_lookup_generic_`:
instead of being emitted only in the `cache miss case`, it is now also
emitted in the `no method was found in the associative cache, check the
full cache` case.

The data is logged from inside each of the two slow-path cases.

* MozillaCACerts: Update to 2025-07-15 (#59010)

* Fix use-after-free in FileWatching (#59017)

We observe an abort on Windows on Revise master CI, where a free'd
handle is passed to jl_close_uv. The root cause is that uv_fseventscb_file
called uvfinalize earlier, but did not set the handle to NULL, so when the
actual finalizer ran later, it would see corrupted state.

* Roll up msys2/clang/windows build fixes (#59003)

This rolls up everything I had to change to get a successful source
build of Julia under msys2. It's a misc collection of msys2, clang and
other fixes. With this, I can use the following Make.user:

```
USE_SYSTEM_CSL=1
USE_BINARYBUILDER_LLVM=0
CC=clang
CXX=clang++
FC=gfortran
```

The default USE_SYSTEM_CSL is broken due to #56840. With
USE_SYSTEM_CSL=1, LLVM is broken due to #57021. Clang is required because
gcc can't do an LLVM source build due to known export symbol size limits
(ref JuliaPackaging/Yggdrasil#11652).

That said, if we address the ABI issues in #56840, the default Make.user
should build again (with BB-provided LLVM).

* Fix tar command (#59026)

Scheduled build failing with 
```
cd [buildroot]/deps/srccache/ && /usr/bin/tar --no-same-owner -xfz [buildroot]/deps/srccache/libunwind-1.8.2.tar.gz
/usr/bin/tar: z: Cannot open: No such file or directory
```
Issue probably introduced in
https://github.com/JuliaLang/julia/pull/58796.

According to chatgpt this will fix it

* Add 'sysimage' keyword for `JULIA_CPU_TARGET` to match (or extend) the sysimage target (#58970)

* add `@__FUNCTION__` and `Expr(:thisfunction)` as generic function self-reference (#58940)

This PR adds `@__FUNCTION__` to match the naming conventions of existing reflection macros (`@__MODULE__`, `@__FILE__`, etc.).

---------

Co-authored-by: Jeff Bezanson <jeff.bezanson@gmail.com>

* Bugfix: Use Base.aligned_sizeof instead of sizeof in Mmap.mmap (#58998)

fix #58982

* Fix PR reference in NEWS (#59046)

* 🤖 [master] Bump the LibCURL stdlib from a65b64f to 038790a (#59038)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* 🤖 [master] Bump the DelimitedFiles stdlib from db79c84 to a982d5c (#59036)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* 🤖 [master] Bump the SHA stdlib from 4451e13 to 169a336 (#59041)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* 🤖 [master] Bump the Pkg stdlib from b85e29428 to 38d2b366a (#59040)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* 🤖 [master] Bump the Statistics stdlib from 77bd570 to 22dee82 (#59043)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* Expand JULIA_CPU_TARGET docs (#58968)

* 🤖 [master] Bump the LinearAlgebra stdlib from 3e4d569 to 2c3fe9b (#59039)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>
Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>

* 🤖 [master] Bump the SparseArrays stdlib from 6d072a8 to 30201ab (#59042)

* 🤖 [master] Bump the JuliaSyntaxHighlighting stdlib from f803fb0 to b666d3c (#59037)

* stored method interference graph (#58948)

Store full method interference relationship graph in interferences field
of Method to avoid expensive morespecific calls during dispatch. This
provides significant performance improvements:
  - Replace method comparisons with precomputed interference lookup.
  - Optimize ml_matches minmax computation using interference lookups.
  - Optimize sort_mlmatches for large return sets by iterating over
    interferences instead of all matching methods.
  - Add method_morespecific_via_interferences in both C and Julia.

This representation may exclude some edges that are implied by
transitivity since sort_mlmatches will ensure the correct result by
following strong edges. Ambiguous edges are guaranteed to be checkable
without recursion.
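
As a rough illustration of the general idea (precompute a pairwise relation once, then answer later queries by lookup), here is a toy sketch. It is not the implementation from this commit: the `sigs` list, the `interferences` table, and the `interferes` helper are all hypothetical names, and plain signature intersection stands in for the real interference relation.

```julia
# Toy model only: "methods" are represented by their signature types, and
# `interferences` is a hypothetical table, not the real `Method` field layout.
sigs = [Tuple{Int}, Tuple{Integer}, Tuple{Number}, Tuple{String}]

# Precompute, once, which other signatures each signature intersects with.
interferences = Dict(
    s => Set(t for t in sigs if t !== s && typeintersect(s, t) !== Union{})
    for s in sigs
)

# Later queries are set-membership checks instead of repeated type comparisons.
interferes(a, b) = b in interferences[a]

interferes(Tuple{Int}, Tuple{Number})  # true: the signatures overlap
interferes(Tuple{Int}, Tuple{String})  # false: the signatures are disjoint
```

The trade-off in a scheme like this is extra storage and up-front work when a method is added, in exchange for cheaper queries afterwards.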

Also fix a variety of bugs along the way:
 - Builtins signature would cause them to try to discard all other
   methods during `sort_mlmatches`.
 - Some ambiguities were over-estimated; these estimates are now improved.
 - Setting lim==-1 now gives the same limited list of methods as lim>0,
   since that is actually faster now than attempting to give the
   unsorted list. This provides a better fix to #53814 than #57837 and
   fixes #58766.
 - Reverts recent METHOD_SIG_LATEST_HAS_NOTMORESPECIFIC attempt (though
   not the whole commit), since I found a significant problem with any
   usage of that bit during testing: it only tracks methods that
   intersect with a target, but new methods do not necessarily intersect
   with any existing target.

This provides a decent performance improvement to `methods` calls, which
implies a decent speed up to package loading also (e.g. ModelingToolkit
loads in about 4 seconds instead of 5 seconds).

* build/llvm: Remove bash-specific curly expansion (#59058)

Fixes #59050

* build: More msys2 fixes (#59028)

* remove a testset from MMAP that might cause CI to now fail on Windows (#59062)

* Use a dedicated parameter attribute to identify the gstack arg. (#59059)

Otherwise, on systems without SwiftCC support (i.e. RISC-V)
`getPGCstack` may return null, disabling the final GC pass.

* skip unnecessary alias-check in `collect(::AbstractArray)` from `copyto!` (#55748)

As discussed on Slack with @MasonProtter & @jakobnissen, `collect`
currently does a usually cheap - but sometimes expensive - aliasing
check (via `unalias`->`mightalias`->`dataid` -> `objectid`) before
copying contents over; this check is unnecessary, however, since the
source array is newly created and cannot possibly alias the input.
This PR fixes that by swapping from `copyto!` to `copyto_unaliased!` in
the `_collect_indices` implementations where the swap is straightforward
(e.g., it is not so straightforward for the fallback
`_collect_indices(indsA, A)`, so I skipped it there).

This improves the following example substantially:
```jl
struct GarbageVector{N} <: AbstractVector{Int}
    v :: Vector{Int}
    garbage :: NTuple{N, Int}
end
GarbageVector{N}(v::Vector{Int}) where N = GarbageVector{N}(v, ntuple(identity, Val(N)))
Base.getindex(gv::GarbageVector, i::Int) = gv.v[i]
Base.size(gv::GarbageVector) = size(gv.v)

using BenchmarkTools
v = rand(Int, 10)
gv = GarbageVector{100}(v)
@btime collect($v);  # 30 ns (v1.10.4)  -> 30 ns (PR)
@btime collect($gv); # 179 ns (v1.10.4) -> 30 ns (PR)
```

Relatedly, it seems the fact that `mightalias` is comparing immutable
contents as well - and hence slowing down the `unalias` check for the
above `GarbageVector` via a slow `objectid` on tuples - is suboptimal. I
don't know how to fix that though, so I'd like to leave that outside
this PR. (Probably related to
https://github.com/JuliaLang/julia/pull/26237)

Co-authored-by: Matt Bauman <mbauman@juliahub.com>

* Fix and update Revise manifest (#59077)

* 🤖 [master] Bump the Pkg stdlib from 38d2b366a to 542ca0caf (#59083)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* Do not needlessly disable CPU features. (#59080)

On QEMU's RISC-V cpu, LLVM's `getHostCPUFeatures` reports:

```
+zksed,+zkne,+zksh,+zfh,+zfhmin,+zacas,+v,+f,+c,+zvknha,+a,+zfa,+ztso,+zicond,+zihintntl,+zvbb,+zvksh,+zvkg,+zbkb,+zvkned,+zvbc,+zbb,+zvfhmin,+zbkc,+d,+i,+zknh,+zicboz,+zbs,+zvksed,+zbc,+zba,+zvknhb,+zknd,+zvkt,+zbkx,+zkt,+zvfh,+zvkb,+m
```

We change that to:

```
+zksed,+zkne,+zksh,+zfh,+zfhmin,+zacas,+v,+f,+c,+zvknha,+a,+zfa,+ztso,+zicond,+zihintntl,+zvbb,+zvksh,+zvkg,+zbkb,+zvkned,+zvbc,+zbb,+zvfhmin,+zbkc,+d,+i,+zknh,+zicboz,+zbs,+zvksed,+zbc,+zba,+zvknhb,+zknd,+zvkt,+zbkx,+zkt,+zvfh,+zvkb,+m,-zcmop,-zca,-zcd,-zcb,-zve64d,-zve64x,-zve64f,-zawrs,-zve32x,-zimop,-zihintpause,-zcf,-zve32f
```

i.e. we add
`-zcmop,-zca,-zcd,-zcb,-zve64d,-zve64x,-zve64f,-zawrs,-zve32x,-zimop,-zihintpause,-zcf,-zve32f`,
disabling the `zve*` extensions after first enabling `v` (which includes
`zvl*b`). That's not valid:

```
LLVM ERROR: 'zvl*b' requires 'v' or 'zve*' extension to also be specified
```

... so disable this post-processing of LLVM feature sets and trust what
it spits out. AFAICT this only matters for the fallback path of
`processor.cpp`, so shouldn't impact most users.

* build: Also pass -fno-strict-aliasing for C++ (#59066)

As diagnosed by Andrew Pinski
(https://github.com/JuliaLang/julia/issues/58466#issuecomment-3105141193),
we are not respecting strict aliasing currently. We turn this off for C,
but the flag appears to be missing for C++. Looks like it's been that
way ever since that flag was first added to our build system (#484). We
should probably consider running TypeSanitizer over our code base to see
if we can make our code correct under strict aliasing as compilers are
increasingly taking advantage of it.

Fixes #58466

* Fix typo in `include`'s docstring (#59055)

* results.json: Fix repo paths so links to github work (#59090)

* Update RISC-V building docs. (#59088)

We have pre-built binaries for RISC-V now.

* Test: improve type stabilities (#59082)

Also simplifies code a bit, by removing unnecessary branches.

* LibCURL_jll: New version 8.15.0 (#59057)

Note that CURL 8.15.0 does not support using Secure Transport on MacOS
any more. This PR thus switches CURL to using OpenSSL on MacOS.

---------

Co-authored-by: Mosè Giordano <765740+giordano@users.noreply.github.com>

* Switch RISC-V to large model on LLVM 20 (#57865)

Co-authored-by: Tim Besard <tim.besard@gmail.com>

* Support complex numbers in eps (#21858)

This came up in
https://github.com/JuliaMath/IterativeSolvers.jl/pull/113#issuecomment-301273365
. JuliaDiffEq and IterativeSolvers.jl have to make sure that the
real-type is pulled out in order for `eps` to work:

```julia
eps(real(typeof(b)))
```

This detail can cause many generically written algorithms with
tolerances, which would otherwise work with complex numbers, to error.
This PR proposes to apply that same trick inside `eps`, so that
`eps(1.0 + 1.0im)` returns machine epsilon for a Float64 (and generally
works for `AbstractFloat`, of course).
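
For reference, the workaround described above can be wrapped in a small helper that works with or without this change. This is only a sketch: `tolerance` is a hypothetical name used for illustration and is not part of Base.

```julia
# Hypothetical helper (not part of Base): take the tolerance from the real
# component type, so the same code accepts real and complex inputs.
tolerance(b::Number) = eps(real(typeof(b)))

tolerance(1.0)          # eps(Float64)
tolerance(1.0 + 1.0im)  # eps(Float64), since real(ComplexF64) === Float64
tolerance(1.0f0 + 0im)  # eps(Float32), since the sum promotes to ComplexF32
```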

---------

Co-authored-by: Steven G. Johnson <stevenj@mit.edu>

* 🤖 [master] Bump the Pkg stdlib from 542ca0caf to d94f8a1d9 (#59093)

Co-authored-by: IanButterworth <1694067+IanButterworth@users.noreply.github.com>

* add array element mutex offset in print and gc (#58997)

The layout, printing, and gc logic need to correctly offset and align
the inset fields to account for the per-element mutex of an atomic
array with large elements.

Fix #58993

* Fix typo in tests introduced by #21858 (#59102)

That [2017 PR](https://github.com/JuliaLang/julia/pull/21858) used very
old types and had a semantic merge conflict.

* Fix msys symlink override rule (#59101)

The `export VAR=VAL` is syntax, so it can't be expanded. Fixes #59096

* inference: Make test independent of the `Complex` method table (#59105)

* Add uptime to CI test info (#59107)

* Fix rounding when converting Rational to BigFloat (#59063)

* make ReinterpretArray more Offset-safe (#58898)

* remove extraneous function included in #21858 (#59109)

Removes an apparently extraneous function accidentally included in
#21858, as noted in
https://github.com/JuliaLang/julia/pull/21858/files#r2233250284.

* [REPL] Handle empty completion, keywords better (#59045)

When the context is empty, (like "<TAB><TAB>"), return only names local
to the module (fixes #58931).

If the cursor is on something that "looks like" an identifier, like a
boolean or one of the keywords, treat it as if it was one for completion
purposes. Typing a keyword and hitting tab no longer returns the
completions for the empty input (fixes #58309, #58832).

* Add builtin function name to add methods error (#59112)

```
julia> Base.throw(x::Int) = 1
ERROR: cannot add methods to builtin function `throw`
Stacktrace:
 [1] top-level scope
   @ REPL[1]:1
```

* better error in juliac for defining main inside a new module (#59106)

This is more helpful if the script you try to compile defines a module
containing main instead of defining it at …