Design choice: A) incorporate the hashed ids into the existing tables as new columns, OR B) set up an additional table mapping hashed ids to real ids, where the hashed id becomes the key for joining back to the existing table (and only the anonymised data is stored in the existing table).
A) Probably the least effort, since the data is all contained in one table. Drawback: the output stage needs a setting to export either clean or identifying data as required.
B) More robust solution, as no identifying data would be stored in the "working" table and data can only be re-identified by a separate operation; it natively outputs de-identified data. Possibly less efficient.
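A rough sketch of option B, to make the trade-off concrete. It assumes an in-memory SQLite database; the table names `working` and `id_lookup`, the `score` column, and the helpers `hash_id` and `insert_record` are all made up for illustration and are not part of the existing schema or toolkit.

```python
import hashlib
import sqlite3


def hash_id(real_id: str) -> str:
    # Plain SHA-256 for brevity (assumption); a keyed or salted hash may be preferable.
    return hashlib.sha256(real_id.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# "working" holds only anonymised data; "id_lookup" is the only place real ids live.
conn.execute("CREATE TABLE working (hashed_id TEXT, score INTEGER)")
conn.execute("CREATE TABLE id_lookup (hashed_id TEXT PRIMARY KEY, real_id TEXT)")


def insert_record(real_id: str, score: int) -> None:
    hashed = hash_id(real_id)
    # The mapping is stored once per id; the working row never sees the real id.
    conn.execute("INSERT OR IGNORE INTO id_lookup VALUES (?, ?)", (hashed, real_id))
    conn.execute("INSERT INTO working VALUES (?, ?)", (hashed, score))


insert_record("user-001", 7)
insert_record("user-002", 4)

# Default output is de-identified.
deidentified = conn.execute(
    "SELECT hashed_id, score FROM working ORDER BY score"
).fetchall()

# Re-identification is a separate, explicit join on the hashed id.
reidentified = conn.execute(
    "SELECT l.real_id, w.score FROM working AS w "
    "JOIN id_lookup AS l ON l.hashed_id = w.hashed_id ORDER BY w.score"
).fetchall()
```

With this layout the lookup table could be access-controlled or stored separately, at the cost of an extra join whenever real ids are needed.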
Need to retain unique ids across multiple data sets. Look into hashing.
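One possible approach, sketched under assumptions: a deterministic keyed hash (HMAC-SHA256 from the standard library) maps the same real id to the same hashed id in every data set, so joins across data sets still work. `SECRET_KEY`, `hash_id`, and the sample rows are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be kept outside the repository.
SECRET_KEY = b"replace-with-a-real-secret"


def hash_id(real_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a deterministic hex digest for real_id using keyed HMAC-SHA256."""
    normalised = str(real_id).strip()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Two made-up data sets that share one id.
dataset_a = [{"id": "user-001", "score": 7}, {"id": "user-002", "score": 4}]
dataset_b = [{"id": "user-001", "visits": 3}]

# Replace the real id with its hash in each data set independently.
hashed_a = [{**row, "id": hash_id(row["id"])} for row in dataset_a]
hashed_b = [{**row, "id": hash_id(row["id"])} for row in dataset_b]
```

The same key has to be used for every data set, otherwise the hashed ids will no longer line up. An unkeyed hash would also be deterministic, but short or guessable ids could then be recovered by hashing candidate values.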
Potentially add a flag for anonymised data to the pre-processing of json and csv.
Needs to retain a value for both to ensure the data set will work with the toolkit.
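A minimal sketch of what such a flag might look like. It assumes one reading of the note above: the id field stays populated whether or not the flag is set, holding either the real id or its hash. The names `preprocess`, `load_json`, `load_csv`, `anonymise`, and `id_field` are hypothetical and do not come from the toolkit.

```python
import csv
import hashlib
import io
import json


def hash_id(real_id: str) -> str:
    # Plain SHA-256 for brevity (assumption); a keyed hash may be preferable.
    return hashlib.sha256(str(real_id).encode("utf-8")).hexdigest()


def preprocess(records, anonymise=False, id_field="id"):
    """Return copies of the records; the id field always keeps a value."""
    out = []
    for record in records:
        row = dict(record)
        if anonymise:
            # Swap the real id for its hash instead of dropping the field.
            row[id_field] = hash_id(row[id_field])
        out.append(row)
    return out


def load_json(text, anonymise=False):
    # Expects a JSON array of objects.
    return preprocess(json.loads(text), anonymise)


def load_csv(text, anonymise=False):
    # Expects a header row; every value is read as a string.
    return preprocess(csv.DictReader(io.StringIO(text)), anonymise)


json_rows = load_json('[{"id": "user-001", "score": 7}]', anonymise=True)
csv_rows = load_csv("id,score\nuser-001,7\n", anonymise=True)
```

Because both loaders go through the same `preprocess` step, an id read from json and the same id read from csv end up with the same hashed value.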
Note: Check with Rio about how he handled it for Tim