Skip to content

Commit bb2ddc4

Browse files
committed
Added text for updating the duplicate variables
This adds a constraint to the table as well. Written up as a markdown.
1 parent c2c29b2 commit bb2ddc4

File tree

1 file changed

+79
-0
lines changed

1 file changed

+79
-0
lines changed
Lines changed: 79 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,79 @@
1+
# Adding a constraint on `ndb.variables`
2+
3+
## Find Duplicated Records
4+
5+
There are a set of conditions where the `ndb.variables` has multiple `variableid` values for the same combination of `(taxonid, variableelementid, variableunitsid, variablecontextid)`. We need to sort these out first, and then clean it all up so we can add the constraint on the data in the table.
6+
7+
```sql
8+
with minid as (
9+
SELECT MIN(variableid) AS varid, taxonid, variableelementid, variableunitsid, variablecontextid FROM ndb.variables
10+
GROUP BY taxonid, variableelementid, variableunitsid, variablecontextid
11+
HAVING COUNT(variableid) > 1),
12+
vrs as (
13+
select vr.variableid, mn.varid from ndb.variables as vr
14+
inner join minid as mn on
15+
((mn.taxonid = vr.taxonid) or (mn.taxonid is null and vr.taxonid is null))
16+
and ((mn.variableelementid = vr.variableelementid) or (mn.variableelementid is null and vr.variableelementid is NULL))
17+
and ((mn.variableunitsid = vr.variableunitsid) or (mn.variableunitsid is null and vr.variableunitsid is NULL))
18+
and ((mn.variablecontextid = vr.variablecontextid) or (mn.variablecontextid is null AND vr.variablecontextid is NULL)))
19+
select * from vrs;
20+
```
21+
22+
With the results from this query we can see every `variableid` that is duplicated, and the minimum `variableid` that accomodates the record.
23+
24+
## Update FOREIGN KEY Tables
25+
26+
The `vrs` table then lets us replace the `variableid` values from tables that use `variableid` as a foreign key.
27+
28+
For example, we should be able to:
29+
30+
```sql
31+
with minid as (
32+
SELECT MIN(variableid) AS varid, taxonid, variableelementid, variableunitsid, variablecontextid FROM ndb.variables
33+
GROUP BY taxonid, variableelementid, variableunitsid, variablecontextid
34+
HAVING COUNT(variableid) > 1),
35+
vrs as (
36+
select vr.variableid, mn.varid from ndb.variables as vr
37+
inner join minid as mn on
38+
((mn.taxonid = vr.taxonid) or (mn.taxonid is null and vr.taxonid is null))
39+
and ((mn.variableelementid = vr.variableelementid) or (mn.variableelementid is null and vr.variableelementid is NULL))
40+
and ((mn.variableunitsid = vr.variableunitsid) or (mn.variableunitsid is null and vr.variableunitsid is NULL))
41+
and ((mn.variablecontextid = vr.variablecontextid) or (mn.variablecontextid is null AND vr.variablecontextid is NULL)))
42+
update ndb.isovariablescaletypes as dt
43+
set variableid = vrs.varid
44+
from vrs
45+
where vrs.variableid = dt.variableid;
46+
```
47+
48+
We can then repeat that for each table with the `variableid` as a foreign key. These tables include:
49+
50+
`ndb.data`; `ndb.datasetvariable`; `ndb.isoinstrumentation`; `ndb.isosrmetadata`; `ndb.isostandards`; `ndb.isovariablescaletypes`
51+
52+
## Delete the Duplicated Records
53+
54+
Once those tables have been updated, we can remove the duplicated variables:
55+
56+
```sql
57+
with minid as (
58+
SELECT MIN(variableid) AS varid, taxonid, variableelementid, variableunitsid, variablecontextid FROM ndb.variables
59+
GROUP BY taxonid, variableelementid, variableunitsid, variablecontextid
60+
HAVING COUNT(variableid) > 1),
61+
vrs as (
62+
select vr.variableid, mn.varid from ndb.variables as vr
63+
inner join minid as mn on
64+
((mn.taxonid = vr.taxonid) or (mn.taxonid is null and vr.taxonid is null))
65+
and ((mn.variableelementid = vr.variableelementid) or (mn.variableelementid is null and vr.variableelementid is NULL))
66+
and ((mn.variableunitsid = vr.variableunitsid) or (mn.variableunitsid is null and vr.variableunitsid is NULL))
67+
and ((mn.variablecontextid = vr.variablecontextid) or (mn.variablecontextid is null AND vr.variablecontextid is NULL)))
68+
DELETE FROM ndb.variables
69+
WHERE variableid = (SELECT variableid FROM vrs) AND NOT variableid = (SELECT varid FROM vrs)
70+
```
71+
72+
## Add the UNIQUE Constraint
73+
74+
Then we finish by updating the table to enforce a `UNIQUE` constraint on the table:
75+
76+
```sql
77+
ALTER TABLE ndb.variables ADD CONSTRAINT variables_unique UNIQUE (taxonid, variableelementid, variableunitsid, variablecontextid);
78+
```
79+

0 commit comments

Comments
 (0)