|
| 1 | +# Adding a constraint on `ndb.variables` |
| 2 | + |
| 3 | +## Find Duplicated Records |
| 4 | + |
| 5 | +There are a set of conditions where the `ndb.variables` has multiple `variableid` values for the same combination of `(taxonid, variableelementid, variableunitsid, variablecontextid)`. We need to sort these out first, and then clean it all up so we can add the constraint on the data in the table. |
| 6 | + |
| 7 | +```sql |
| 8 | +with minid as ( |
| 9 | + SELECT MIN(variableid) AS varid, taxonid, variableelementid, variableunitsid, variablecontextid FROM ndb.variables |
| 10 | + GROUP BY taxonid, variableelementid, variableunitsid, variablecontextid |
| 11 | + HAVING COUNT(variableid) > 1), |
| 12 | +vrs as ( |
| 13 | + select vr.variableid, mn.varid from ndb.variables as vr |
| 14 | + inner join minid as mn on |
| 15 | + ((mn.taxonid = vr.taxonid) or (mn.taxonid is null and vr.taxonid is null)) |
| 16 | + and ((mn.variableelementid = vr.variableelementid) or (mn.variableelementid is null and vr.variableelementid is NULL)) |
| 17 | + and ((mn.variableunitsid = vr.variableunitsid) or (mn.variableunitsid is null and vr.variableunitsid is NULL)) |
| 18 | + and ((mn.variablecontextid = vr.variablecontextid) or (mn.variablecontextid is null AND vr.variablecontextid is NULL))) |
| 19 | +select * from vrs; |
| 20 | +``` |
| 21 | + |
| 22 | +With the results from this query we can see every `variableid` that is duplicated, and the minimum `variableid` that accomodates the record. |
| 23 | + |
| 24 | +## Update FOREIGN KEY Tables |
| 25 | + |
| 26 | +The `vrs` table then lets us replace the `variableid` values from tables that use `variableid` as a foreign key. |
| 27 | + |
| 28 | +For example, we should be able to: |
| 29 | + |
| 30 | +```sql |
| 31 | +with minid as ( |
| 32 | + SELECT MIN(variableid) AS varid, taxonid, variableelementid, variableunitsid, variablecontextid FROM ndb.variables |
| 33 | + GROUP BY taxonid, variableelementid, variableunitsid, variablecontextid |
| 34 | + HAVING COUNT(variableid) > 1), |
| 35 | +vrs as ( |
| 36 | + select vr.variableid, mn.varid from ndb.variables as vr |
| 37 | + inner join minid as mn on |
| 38 | + ((mn.taxonid = vr.taxonid) or (mn.taxonid is null and vr.taxonid is null)) |
| 39 | + and ((mn.variableelementid = vr.variableelementid) or (mn.variableelementid is null and vr.variableelementid is NULL)) |
| 40 | + and ((mn.variableunitsid = vr.variableunitsid) or (mn.variableunitsid is null and vr.variableunitsid is NULL)) |
| 41 | + and ((mn.variablecontextid = vr.variablecontextid) or (mn.variablecontextid is null AND vr.variablecontextid is NULL))) |
| 42 | +update ndb.isovariablescaletypes as dt |
| 43 | +set variableid = vrs.varid |
| 44 | +from vrs |
| 45 | +where vrs.variableid = dt.variableid; |
| 46 | +``` |
| 47 | + |
| 48 | +We can then repeat that for each table with the `variableid` as a foreign key. These tables include: |
| 49 | + |
| 50 | +`ndb.data`; `ndb.datasetvariable`; `ndb.isoinstrumentation`; `ndb.isosrmetadata`; `ndb.isostandards`; `ndb.isovariablescaletypes` |
| 51 | + |
| 52 | +## Delete the Duplicated Records |
| 53 | + |
| 54 | +Once those tables have been updated, we can remove the duplicated variables: |
| 55 | + |
| 56 | +```sql |
| 57 | +with minid as ( |
| 58 | + SELECT MIN(variableid) AS varid, taxonid, variableelementid, variableunitsid, variablecontextid FROM ndb.variables |
| 59 | + GROUP BY taxonid, variableelementid, variableunitsid, variablecontextid |
| 60 | + HAVING COUNT(variableid) > 1), |
| 61 | +vrs as ( |
| 62 | + select vr.variableid, mn.varid from ndb.variables as vr |
| 63 | + inner join minid as mn on |
| 64 | + ((mn.taxonid = vr.taxonid) or (mn.taxonid is null and vr.taxonid is null)) |
| 65 | + and ((mn.variableelementid = vr.variableelementid) or (mn.variableelementid is null and vr.variableelementid is NULL)) |
| 66 | + and ((mn.variableunitsid = vr.variableunitsid) or (mn.variableunitsid is null and vr.variableunitsid is NULL)) |
| 67 | + and ((mn.variablecontextid = vr.variablecontextid) or (mn.variablecontextid is null AND vr.variablecontextid is NULL))) |
| 68 | +DELETE FROM ndb.variables |
| 69 | +WHERE variableid = (SELECT variableid FROM vrs) AND NOT variableid = (SELECT varid FROM vrs) |
| 70 | +``` |
| 71 | + |
| 72 | +## Add the UNIQUE Constraint |
| 73 | + |
| 74 | +Then we finish by updating the table to enforce a `UNIQUE` constraint on the table: |
| 75 | + |
| 76 | +```sql |
| 77 | +ALTER TABLE ndb.variables ADD CONSTRAINT variables_unique UNIQUE (taxonid, variableelementid, variableunitsid, variablecontextid); |
| 78 | +``` |
| 79 | + |
0 commit comments