This repository was archived by the owner on May 17, 2024. It is now read-only.
Do not detect MD5s as UUIDs, and preserve UUID casing for UUID PKs #813
Comparing MD5s as UUIDs does not work anyway: it improperly slices and then compares the values, since our code always renders UUIDs as `abcdabcd-abcd-abcd-abcd-abcdabcdabcd`, always dashed and lower-cased, while the actual value stored in MD5 (i.e. string) PKs can be uppercased and is typically non-dashed (e.g. `ABCDABCDABCDABCDABCDABCDABCDABCD`). As a result, all such MD5 PKs go into one pseudo-UUID range, usually the first one (because in ASCII and UTF-8, uppercase letters sort before lowercase letters).

The root cause is that Python's `UUID` can parse even such values.
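The parsing behaviour can be reproduced with the standard library alone. The snippet below is a minimal illustration, not code from this PR; the variable names are made up for the example.

```python
import uuid

# An MD5-style key: 32 hex characters, no dashes, uppercase.
md5_like = "ABCDABCDABCDABCDABCDABCDABCDABCD"

# uuid.UUID accepts the undashed hex form without complaint.
parsed = uuid.UUID(md5_like)

# str() always yields the canonical form: dashed and lowercase.
rendered = str(parsed)

# The rendered value no longer matches what the database stores...
round_trips = rendered == md5_like

# ...and as plain strings, the uppercase original sorts before it.
sorts_before = md5_like < rendered
```

Here `rendered` is `abcdabcd-abcd-abcd-abcd-abcdabcdabcd`, `round_trips` is `False`, and `sorts_before` is `True`, which is why range bounds built from rendered UUIDs do not partition the stored strings as intended.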
This PR excludes MD5s and other UUID-like textual PKs from UUID detection.
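One way such an exclusion could look is sketched below. This is an assumption for illustration, not necessarily the check this PR implements: the helper name `looks_like_canonical_uuid` is hypothetical, and it simply requires that the text already be in dashed form, ignoring case.

```python
import uuid


def looks_like_canonical_uuid(value: str) -> bool:
    """Hypothetical helper: accept only dashed 8-4-4-4-12 UUID text."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        # Not parseable as a UUID at all.
        return False
    # uuid.UUID is lenient (undashed hex, braces, urn: prefix), so compare
    # the canonical rendering with the input, ignoring letter case.
    return str(parsed) == value.lower()


dashed = looks_like_canonical_uuid("ABCDABCD-ABCD-ABCD-ABCD-ABCDABCDABCD")
undashed = looks_like_canonical_uuid("ABCDABCDABCDABCDABCDABCDABCDABCD")
garbage = looks_like_canonical_uuid("not-a-uuid")
```

With this check the dashed value is accepted, while the undashed MD5-style value and the non-UUID string are rejected and would be handled as ordinary text keys.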
As an extra change (in separate commits), this PR also preserves information on how the database presents UUIDs, either lowercased or uppercased, and renders the sliced UUID values accordingly. This does not matter for native UUIDs (stored and compared as numbers), but it does matter for UUIDs stored and/or compared as strings (on at least one side of the diff).
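The idea of carrying the casing along can be sketched roughly as follows. Both function names are hypothetical and the detection rule (inspect one sample value) is an assumption made for the example, not a description of the PR's actual code.

```python
import uuid


def detect_uppercase(sample: str) -> bool:
    """Hypothetical: True if the sample contains letters and all are uppercase."""
    return sample == sample.upper() and sample != sample.lower()


def render_uuid(value: uuid.UUID, uppercase: bool) -> str:
    """Render a UUID in dashed form, matching the casing the database uses."""
    text = str(value)  # canonical form is lowercase
    return text.upper() if uppercase else text


stored = "ABCDABCD-ABCD-ABCD-ABCD-ABCDABCDABCD"
uppercase = detect_uppercase(stored)
rendered = render_uuid(uuid.UUID(stored), uppercase)
```

In this sketch `rendered` equals `stored`, so a range bound produced from the parsed value compares correctly against keys the database holds as uppercase strings.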