Optimize qubit hash for Set operations #6908

Open
wants to merge 10 commits into
base: main

daxfohl
Collaborator

@daxfohl daxfohl commented Jan 1, 2025

Change the hash function from hashing a tuple to manually multiplying each term by 1_000_003, which is also the multiplier CPython uses for the imaginary part of complex-number hashes (`sys.hash_info.imag`) and used in its older string hash. This hashes at the same speed as the tuple hash but maintains a linear relationship with each term, which reduces the number of bucket collisions in the hash tables underlying Sets and Dicts for line and grid qubits. It improves the performance of amortized Set operations such as the one below by around 50%.

```python
import cirq

s = set()
for q in cirq.GridQubit.square(100):
    s = s.union({q})
```
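
The difference between the two strategies can be sketched as follows. This is a minimal illustration, not the code in this PR: `tuple_hash`, `linear_hash`, and the order in which the terms are combined are assumptions made for the example; only the multiplier `1_000_003` comes from the description.

```python
# Hypothetical stand-ins for the two strategies; not the actual Cirq implementation.
MULTIPLIER = 1_000_003  # multiplier named in the PR description


def tuple_hash(row: int, col: int) -> int:
    """Standard approach: let Python hash the coordinate tuple."""
    return hash((row, col))


def linear_hash(row: int, col: int) -> int:
    """Assumed form of the manual hash: a linear combination of the terms."""
    return row * MULTIPLIER + col


# Neighboring grid positions differ by a fixed amount under the linear hash.
print(linear_hash(3, 5) - linear_hash(3, 4))  # one step in the column index
print(linear_hash(4, 4) - linear_hash(3, 4))  # one step in the row index
# The tuple hash scrambles its inputs, so the same differences look arbitrary.
print(tuple_hash(3, 5) - tuple_hash(3, 4))
```

Under this assumed form, adjacent columns map to consecutive hash values and adjacent rows are offset by the multiplier, which is the "linear relationship" the description refers to.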

Fixes #6886

Improves the amortized performance of `Set` operations by around 50%. One caveat: qudits with different dimensions but the same index always get the same hash value (not just the same bucket), so a set containing them has to fall back to `__eq__` checks, which degrades performance. It seems unlikely that anyone would do this intentionally, though.
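
The degenerate case described above can be reproduced with a toy class. This is a sketch under stated assumptions: `FakeQudit` is hypothetical and is not a Cirq type; its hash simply ignores the dimension so that every element with the same index collides.

```python
class FakeQudit:
    """Hypothetical qudit whose hash ignores its dimension (illustration only)."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, index: int, dimension: int) -> None:
        self.index = index
        self.dimension = dimension

    def __hash__(self) -> int:
        # The dimension is deliberately left out, mirroring the caveat above.
        return self.index

    def __eq__(self, other: object) -> bool:
        FakeQudit.eq_calls += 1
        if not isinstance(other, FakeQudit):
            return NotImplemented
        return (self.index, self.dimension) == (other.index, other.dimension)


# Same index, different dimensions: every element lands on the same hash value.
same_index = {FakeQudit(0, d) for d in range(2, 202)}
print(len(same_index), "elements,", FakeQudit.eq_calls, "__eq__ calls")
```

Because the hashes are identical, the set cannot tell the elements apart without calling `__eq__`, so the number of comparisons grows much faster than the number of elements.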

@daxfohl daxfohl requested review from vtomole and a team as code owners January 1, 2025 19:38
@daxfohl daxfohl requested a review from mhucka January 1, 2025 19:38
@CirqBot CirqBot added the size: S 10< lines changed <50 label Jan 1, 2025

codecov bot commented Jan 2, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.87%. Comparing base (5d317ba) to head (57468b5).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6908   +/-   ##
=======================================
  Coverage   97.87%   97.87%           
=======================================
  Files        1084     1084           
  Lines       94406    94408    +2     
=======================================
+ Hits        92396    92398    +2     
  Misses       2010     2010           


Comment on lines 41 to 42
# This approach seems to perform better than traditional "random" hash in `Set`
# operations for typical circuits, as it reduces bucket collisions. Caveat: it does not
Contributor

How did you evaluate this reduction in bucket collisions? Would be good to show this explicitly before we decide to abandon the standard tuple hash.
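
One way to make the collision claim explicit is to count how many hash values share an initial slot in a power-of-two table. The sketch below is an approximation under stated assumptions: `initial_slot_collisions`, `linear_hash`, and the table size of `2**15` are illustrative choices, and it ignores the probing CPython does after the initial slot, so it is only a rough proxy for real `set` behavior.

```python
from collections import Counter
from typing import Iterable

MULTIPLIER = 1_000_003  # multiplier named in the PR description


def linear_hash(row: int, col: int) -> int:
    """Assumed form of the manual hash; not the actual Cirq implementation."""
    return row * MULTIPLIER + col


def initial_slot_collisions(hashes: Iterable[int], table_size: int) -> int:
    """Count hashes whose initial slot (hash & mask) is already taken by an earlier one."""
    mask = table_size - 1  # table_size is assumed to be a power of two
    slots = Counter(h & mask for h in hashes)
    return sum(count - 1 for count in slots.values())


coords = [(r, c) for r in range(100) for c in range(100)]
table_size = 2**15  # assumed table size for roughly 10,000 entries
print("tuple :", initial_slot_collisions((hash(rc) for rc in coords), table_size))
print("linear:", initial_slot_collisions((linear_hash(r, c) for r, c in coords), table_size))
```

The printed counts depend on the Python version and the assumed table size, so they should be read as an indication rather than a measurement of the actual `set` implementation.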

Collaborator Author

@daxfohl daxfohl Jan 2, 2025

Test code is up in the description. It's about 50% faster with this implementation.

One note: it seems to be faster only for copy-on-change ops like `s = s.union({q})`. It doesn't seem to have any effect when we mutate sets in place, as in `s |= {q}`. But given that most of our stuff is immutable, we see a lot more of the former in our codebase.
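
The two update styles mentioned above can be compared without Cirq. In this minimal sketch, `build_by_union`, `build_in_place`, and the coordinate tuples standing in for grid qubits are assumptions for illustration; the timings will vary by machine and say nothing about the hash change itself.

```python
import timeit
from typing import Hashable, Iterable, Set


def build_by_union(items: Iterable[Hashable]) -> Set[Hashable]:
    """Copy-on-change: every step allocates a new set holding all prior elements."""
    s: Set[Hashable] = set()
    for item in items:
        s = s.union({item})
    return s


def build_in_place(items: Iterable[Hashable]) -> Set[Hashable]:
    """Mutable update: the same set object grows one element at a time."""
    s: Set[Hashable] = set()
    for item in items:
        s |= {item}
    return s


items = [(r, c) for r in range(40) for c in range(40)]  # stand-in for grid qubits
for fn in (build_by_union, build_in_place):
    seconds = timeit.timeit(lambda: fn(items), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

Since the union form rebuilds the whole set on every step, every existing element is re-inserted each time, which is one plausible reason it is more sensitive to how the hash values are distributed.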

cirq-core/cirq/devices/grid_qubit.py (outdated, resolved review thread)
@daxfohl daxfohl marked this pull request as draft January 6, 2025 17:28
@daxfohl daxfohl marked this pull request as ready for review January 12, 2025 06:43
@daxfohl daxfohl requested a review from maffoo January 12, 2025 06:59
@pavoljuhas
Collaborator

I also see an improvement for set construction from grid qubits; there is no significant difference for in-place set updates.
Note that I ran `[hash(q) for q in cirq.GridQubit.square(100)]` first so that the hash values were cached up front.

```python
# IPython session; each %%timeit block is its own cell.
import cirq

sq = cirq.GridQubit.square(100)
[hash(q) for q in sq]  # warm the cached hash values

# set from ordered qubits
%timeit set(sq)
#
# OLD: 1.54 ms ± 58.9 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
# NEW: 1.32 ms ± 75.5 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

# set update in place
%%timeit s = set()
for q in sq:
    s.add(q)
#
# OLD: 1.43 ms ± 15.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
# NEW: 1.44 ms ± 60.6 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

# new sets created via union
%%timeit s = set()
for q in sq:
    s = s.union({q})
#
# OLD: 820 ms ± 3.25 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# NEW: 347 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
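
Warming the hashes before timing only matters if each object computes its hash lazily and stores the result. The sketch below shows that pattern with a hypothetical `CachedHashPoint` class; it is an assumption for illustration and is not taken from the Cirq source.

```python
class CachedHashPoint:
    """Hypothetical immutable point that computes its hash once and reuses it."""

    hash_computations = 0  # class-wide counter of actual hash computations

    def __init__(self, row: int, col: int) -> None:
        self._row = row
        self._col = col
        self._hash = None  # filled in on the first call to __hash__

    def __hash__(self) -> int:
        if self._hash is None:
            CachedHashPoint.hash_computations += 1
            self._hash = hash((self._row, self._col))
        return self._hash

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, CachedHashPoint):
            return NotImplemented
        return (self._row, self._col) == (other._row, other._col)


points = [CachedHashPoint(r, c) for r in range(10) for c in range(10)]
[hash(p) for p in points]  # warm the cache, as in the benchmark above
set(points)  # reuses the cached values; no hash is recomputed
print(CachedHashPoint.hash_computations)
```

With the cache warmed, the timed operations measure set behavior rather than the cost of computing each hash for the first time.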

@daxfohl daxfohl added this pull request to the merge queue Jan 21, 2025
@daxfohl daxfohl removed this pull request from the merge queue due to a manual request Jan 21, 2025

Successfully merging this pull request may close these issues.

Make Line and Grid Qubit hashes faster for common set ops
4 participants