-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make UUID columns unique in sqla #2323
Make UUID columns unique in sqla #2323
Conversation
@sphuber unexpectedly, the tests started passing after bringing the branch up to date with Let me know how best to proceed, feel free to push to my fork if you like. |
Let's wait with merging this. I would like to have a look next week to see that we properly generalize the fixes I originally developed for just |
f61dead
to
a8df59c
Compare
I have squashed and rebased @ltalirz initial work. On top of that I have added a commit that should address the bit that was missing: generalizing the deduplication code from working just on |
fdc5a66
to
9f723cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand correctly that this is fixing old migrations and not adding new ones, right?
from aiida.common.exceptions import IntegrityError | ||
|
||
query = text( | ||
'SELECT s.id, s.uuid FROM (SELECT *, COUNT(*) OVER(PARTITION BY uuid) AS c FROM {}) AS s WHERE c > 1'.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct, but a (possibly faster) way to get the information without subqueries and aggregate functions is just using group by, something like
SELECT uuid, COUNT(uuid) from {} GROUP BY uuid HAVING COUNT(uuid) > 1
(that gives a different result, but in the end you just use the result to know if there are duplicates, not which one they are).
Just for info anyway (and possibly you need the aggregate when actually deduplicating the uuid).
@giovannipizzi the latest status on this PR was (if I remember correctly) that we were waiting for you to merge in certain changes into |
@ltalirz I am taking care of it, this is still dependent on another PR that needs merging. |
9f723cf
to
ebf743e
Compare
This adds the uniqueness constraint on the UUID fields of the models: * Computer * Comment * Group * Node * Workflow A check is performed before the migration is applied, to see whether the affected tables actually contain any entries with duplicate UUIDs. In this case a warning is emitted and the migration is stopped. The infrastructure to deduplicate, that is present all ready for nodes, has not yet been generalized to arbitrary tables.
The original deduplication code was implemented only for the `DbNode` table but has now been extended to `DbComputer` and `DbGroup`. Note that this has not been done for `DbComment`, because it does not yet have an entity loader. If `load_comment` were to be implemented, support for deduplication can be added to the `deduplicate_uuids` function.
ebf743e
to
c034eb6
Compare
Fixes #2267
In django, all uuid columns are now unique. This does the same for sqlalchemy.