Skip to content

Conversation

jipolanco
Copy link
Contributor

This avoids a small allocation (and improves performance) when iterating over a ProductIterator constructed using mixed iterator types. See #40283 for details.

I couldn't really identify where the allocation comes from, and there's likely an underlying issue to be still fixed. But this does solve the ProductIterator issue.

When used with homogeneous iterator types, this has no additional cost compared to the original short-circuiting version, as seen in the example below.

Also note that Base.map (used in this PR) was already being used in isdone(z::Zip).

Closes #40283.

Example:

using BenchmarkTools

# 1. Mixing different iterator types (heterogeneous iterator)
xs = (1:5, 1:0.5:4)
iter = Iterators.product(xs...)

# Julia master:
@btime iterate($iter);  # 37.649 ns (2 allocations: 144 bytes)

# With this PR:
@btime iterate($iter);  # 8.586 ns (0 allocations: 0 bytes)

# 2. Using the same iterator type (homogeneous iterator)
xs = (1:5, 2:4)
iter = Iterators.product(xs...)

# Julia master:
@btime iterate($iter);  # 2.181 ns (0 allocations: 0 bytes)

# With this PR:
@btime iterate($iter);  # 2.124 ns (0 allocations: 0 bytes)

@N5N3
Copy link
Member

N5N3 commented Feb 7, 2022

any(f, iter::Tuple) use loop-based fallback thus any mixed types might do harm to performance.
Maybe we should unroll the loop for short iter (similar to Base._tuple_any)

@jipolanco
Copy link
Contributor Author

Seems reasonable to add an optimised version of any (and also all?) for tuples.

Here is one possible implementation:

# any(f, itr) = _any(f, itr, :)  # definition in base/reduce.jl

_any(f, itr::Tuple, ::Colon) = _any_tuple(f, false, itr...)

@inline function _any_tuple(f, anymissing, x, rest...)
    v = f(x)
    if ismissing(v)
        anymissing = true
    elseif v
        return true
    end
    return _any_tuple(f, anymissing, rest...)
end

@inline _any_tuple(f, anymissing) = anymissing ? missing : false

Should this be limited to small tuples?

@jipolanco jipolanco force-pushed the jip/product-iterator branch from 0cbdce3 to d329e32 Compare February 8, 2022 10:11
@jipolanco jipolanco changed the title Avoid allocation in iterate(::ProductIterator) Specialise any and all for tuples Feb 8, 2022
@N5N3
Copy link
Member

N5N3 commented Feb 9, 2022

I think it's OK to put these optimization in reduce.jl to fix the build error.
Another suggestion: if eltype(x::Tuple) is concrete, it seems good to use for loop and Let LLVM unroll for us.

# If eltype(x) is concrete. Use for loop
any(f, x::NTuple{N}) where {N} = _any(f, x, :)
# Otherwise force unroll to avoid union-split. (Only for small Tuple)
function any(f, x::NTuple{N,Any}) where {N}
    N >= 32 && return _any(f, x, :) # Not tuned, just follow Any32
    _any_tuple(f, false, x...)
end

@jipolanco jipolanco force-pushed the jip/product-iterator branch from f1671b1 to e8c25b4 Compare February 9, 2022 08:03
@jipolanco
Copy link
Contributor Author

jipolanco commented Feb 9, 2022

@N5N3 Thanks for the suggestions! This should be taken care of now.

Fall back to original for-loop implementation if the tuple either:

- is a homogeneous tuple (all elements have the same type);
- has length > 32.
@KristofferC
Copy link
Member

@jipolanco Is this good to go?

@jipolanco
Copy link
Contributor Author

@KristofferC Yes I think it's ready to be merged.

@KristofferC KristofferC merged commit d5b95c7 into JuliaLang:master May 9, 2022
@jipolanco jipolanco deleted the jip/product-iterator branch May 9, 2022 09:54
@nsajko nsajko added performance Must go faster collections Data structures holding multiple items, e.g. sets labels Jan 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allocation in iterating over a ProductIterator constructed using mixed iterator types
5 participants