Skip to content

[WIP][Sema][stdlib] Emit a deprecation warning when a Hashable type only implements hashValue #20685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

lorentey
Copy link
Member

@lorentey lorentey commented Nov 20, 2018

SE-206 deprecated hashValue as a Hashable requirement, and in Swift 4.2, the stdlib's hashing collections have all migrated to using hash(into:) as the primary hashing interface.

Types that only implement hashValue often take improper shortcuts by omitting components due to the difficulty of combining hash values. This leads to hash collisions that can break the performance of Set and Dictionary.

  • Emit a deprecation warning when a type conforms to Hashable by providing an explicit hashValue definition without also providing hash(into:).
  • Fix overlay issues uncovered by the new warning, improving hashing where necessary. Leave the old hashValue implementations in place for now. (They should be removed soon.)

This is unlikely to land as is: the overlay changes are interesting on their own, and they will probably be spun off into separate PR(s).

rdar://problem/39838478

…into:) isn’t

SE-206 deprecated hashValue as a protocol requirement. We should gently encourage people to migrate to hash(into:), for its more secure, easier and faster hashing.

Emit a compiler warning whenever hashValue has an explicit implementation, but hash(into:) doesn’t.
This replaces the previously compiler-synthesized implementation with one that directly feeds the selector pointer to the hasher. The new definition is functionally equivalent, but warning-free and faster.
This replaces the previously compiler-synthesized implementation with one that directly feeds the wrapped value to the hasher. The new definition is functionally equivalent, but warning-free and faster.
@lorentey
Copy link
Member Author

This will fail testing for now. I intend to limit the warning to Swift 5 mode soon, which will hide most of the failures. However, let's first see how much effort it would be to migrate the tests over.

@lorentey
Copy link
Member Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - f2e431f61b616882506a23280ac8c90d4d67e447

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - f2e431f61b616882506a23280ac8c90d4d67e447

…sh(into:)

SE-0206 deprecated hashValue as a Hashable requirement; the primary hashing interface is now hash(into:). We should migrate existing Hashable conformances to implement the new interface.

The manual hash(into:) definitions in this batch of changes are written to prevent nested Hasher sessions, but otherwise they do the same thing as the compiler-synthesized variants — typically forwarding the actual job of hashing to an underlying [Core]Foundation type.

The existing hashValue definitions are currently left intact. They should be removed later (i.e., replaced by compiler-synthesized default implementation), so that hashValue gets defined in terms of hash(into:), with random seeding etc.
The explicit implementation diverges from the original hashValue; it properly feeds all components to the hasher.
…ll bytes

The hash(into:) definition diverges from hashValue in that it properly hashes *all* bytes contained in the instance, not just an arbitrary prefix.
The new definition does no attempt to replicate hashValue’s #availability check.
(Swift uses randomly seeded hashing, so code must not rely on hashValue returning any specific integer value.)
The new definition simply feeds both components to the supplied hasher.

This is functionally equivalent to hashValue’s original use of CFHashBytes, although it will generate different hashes.
The new definition is functionally equivalent to the original hashValue implementation, although it produces different hashes.

Add FIXME about the hash encoding not being unique.
…*all* components

The new definition diverges from the existing hashValue: it feeds all path components to the hasher, rather than just a limited subset.

Hashing all the bits that are compared in == is necessary to ensure proper hashing.
The new definition is functionally equivalent to hashValue, although it produces different hash values.
The new definition is functionally equivalent to hashValue, although it produces different hash values.
The new definition normalizes measurements with dimensional units to their base unit, ensuring that the generated hash values are consistent with the definition of ==. It also feeds the unit to the hasher, not just the numerical value.

The old hashValue definition was invalid, and has been removed. (I.e., replaced by a compiler-supplied implementation that works in terms of hash(into:).)

rdar://problem/46187171
The new definition is the same as the one synthesized by the compiler.
@lorentey lorentey force-pushed the deprecate-hashvalue-for-real branch from f2e431f to 9bc479a Compare November 21, 2018 18:48
@lorentey
Copy link
Member Author

@swift-ci please test

@natecook1000
Copy link
Member

@lorentey Did everything here get landed? I see various parts in the referenced PRs.

@CodaFi
Copy link
Contributor

CodaFi commented Nov 18, 2019

Seems like we got them all. If I'm wrong, please feel free to reopen this.

@CodaFi CodaFi closed this Nov 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants