petvana (Member) commented Nov 4, 2023

Minor optimization: compute the key's index in the underlying Dict only once.

This PR should not be merged before #52017.

Master

  pop_all(s, x):              126.417 μs (1 allocation: 16 bytes)
  pop_all(s, x_negative):     147.812 μs (1 allocation: 16 bytes)

PR

  pop_all_PR(s, x):            86.494 μs (1 allocation: 16 bytes)
  pop_all_PR(s, x_negative):  156.912 μs (1 allocation: 16 bytes)

Testing code
using BenchmarkTools

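function PR_pop!(s::Set, x, default)
    dict = s.dict
    # Hash x and locate its slot once; the index is not positive if x is absent.
    index = Base.ht_keyindex(dict, x)
    if index > 0
        # Return the stored key (it may differ from x while still being equal to it).
        @inbounds key = dict.keys[index]
        # Delete by slot index, so no second hash or probe is needed.
        Base._delete!(dict, index)
        return key
    else
        return default
    end
end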

N = 10000
x = collect(1:N)
x_negative = collect(-N:-1)

function pop_all(s, x)
    for v in x
        pop!(s, v, -1)
    end
end

function pop_all_PR(s, x)
    for v in x
        PR_pop!(s, v, -1)
    end
end

# Master
@btime pop_all(s, x) setup=(s=Set(x))
@btime pop_all(s, x_negative) setup=(s=Set(x))

# PR
@btime pop_all_PR(s, x) setup=(s=Set(x))
@btime pop_all_PR(s, x_negative) setup=(s=Set(x))
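
The effect of computing the index only once can be made visible by counting hash calls instead of timing them. The following is a minimal sketch, not code from the PR: `CountedKey`, `HASH_CALLS`, `two_lookup_pop!`, `one_lookup_pop!`, and `count_hashes` are hypothetical names introduced for illustration, and `two_lookup_pop!` is only an assumed stand-in for a "check membership, then pop" pattern, not a quotation of the implementation on master. `Base.ht_keyindex` and `Base._delete!` are unexported internals, as in the testing code above.

```julia
# Hypothetical key type (not from the PR) that counts how often it is hashed.
struct CountedKey
    id::Int
end

const HASH_CALLS = Ref(0)

function Base.hash(k::CountedKey, h::UInt)
    HASH_CALLS[] += 1
    return hash(k.id, h)
end

# Illustrative two-lookup variant: membership test first, then removal.
two_lookup_pop!(s::Set, x, default) = x in s ? pop!(s, x) : default

# Single-lookup variant, mirroring PR_pop! from the testing code.
function one_lookup_pop!(s::Set, x, default)
    dict = s.dict
    index = Base.ht_keyindex(dict, x)   # the only hash computation
    index > 0 || return default
    key = dict.keys[index]
    Base._delete!(dict, index)          # deletes by slot, no rehash
    return key
end

# Reset the counter, run one pop, and report how many hash calls it needed.
function count_hashes(pop_fn, s, x)
    HASH_CALLS[] = 0
    pop_fn(s, x, nothing)
    return HASH_CALLS[]
end

ks = [CountedKey(i) for i in 1:4]
println("two lookups: ", count_hashes(two_lookup_pop!, Set(ks), ks[1]))
println("one lookup:  ", count_hashes(one_lookup_pop!, Set(ks), ks[1]))
```

Under these assumptions, popping a key that is present should cost two hash calls in the first variant and one in the second. For a key that is absent, both variants hash once, which fits the x_negative case showing no speedup.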

petvana added the "collections" and "performance" labels Nov 4, 2023
petvana marked this pull request as draft November 4, 2023 10:37
jakobnissen (Member) commented

I'm curious why it is slower for x_negative.
Anyway, your benchmark is the best possible case for the implementation on master: if hashing is more expensive, this PR's implementation will pull further ahead.

petvana (Member, Author) commented Nov 4, 2023

Yes, the speedup is about 30% here, but it should be close to 50% for an expensive hash function.

The slowdown for x_negative (returning the default value) is almost negligible, and I guess it depends on specific compiler optimizations.
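
The 30% and 50% figures can be related through a rough cost model. This is a sketch under stated assumptions, not a measurement: let $H$ be the cost of one hash-and-lookup and $R$ the remaining work per call (both symbols are introduced here for illustration), and assume the implementation on master performs two lookups per successful pop while the PR performs one.

```latex
t_{\text{master}} = 2H + R, \qquad t_{\text{PR}} = H + R

\text{saving} = \frac{t_{\text{master}} - t_{\text{PR}}}{t_{\text{master}}}
              = \frac{H}{2H + R}
              \;\longrightarrow\; \frac{1}{2}
              \quad \text{as } H/R \to \infty

\text{measured (keys present):}\quad
\frac{126.417 - 86.494}{126.417} \approx 0.316
```

With cheap integer hashing, $R$ is comparable to $H$, so the saving stays near 30%; as hashing dominates, the model approaches the 50% limit.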

petvana marked this pull request as ready for review November 9, 2023 06:24

petvana (Member, Author) commented Nov 10, 2023

Once the PR passes CI, it seems ready.

vtjnash added the "merge me" label Nov 10, 2023
oscardssmith merged commit d88d5cd into JuliaLang:master Nov 10, 2023
giordano removed the "merge me" label Nov 15, 2023