Functionality - Option to anonymise author_id and display name #50

Open
deimosnz opened this issue Sep 14, 2022 · 1 comment
Labels: enhancement (New feature or request)

Comments

@deimosnz

Potentially add a flag to the JSON and CSV pre-processing step that anonymises author_id and display name.

Both fields need to retain a value so that the data set still works with the toolkit.
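
A rough sketch of what such a flag could look like, in Python. Everything here is an assumption for illustration: the `preprocess` function, the `anonymise` parameter, and the `display_name` column name are hypothetical and not taken from the toolkit. It only shows that both fields can keep a non-empty, consistent value after anonymisation.

```python
import csv
import io

def preprocess(rows, anonymise=False):
    """Return copies of rows; when anonymise is True, replace identifying fields.

    Hypothetical helper: field names are assumed, not the toolkit's schema.
    """
    pseudonyms = {}  # real author_id -> sequence number, within this run only
    out = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is left untouched
        if anonymise:
            n = pseudonyms.setdefault(row["author_id"], len(pseudonyms) + 1)
            # Both fields keep a value, so downstream code that expects them still works.
            row["author_id"] = f"anon_{n:06d}"
            row["display_name"] = f"User {n}"
        out.append(row)
    return out

# Made-up sample data standing in for a real CSV export.
SAMPLE_CSV = "author_id,display_name,text\n111,Alice,hello\n222,Bob,hi\n111,Alice,again\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clean = preprocess(rows, anonymise=True)
```

Sequence numbers like these are only stable within a single run, so this sketch alone would not keep ids consistent across separate data sets.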

Note: Check with Rio about how he handled it for Tim

@deimosnz deimosnz added the enhancement New feature or request label Sep 14, 2022
@deimosnz deimosnz self-assigned this Sep 14, 2022
@deimosnz

Design choice: either (A) incorporate the anonymised values into the existing tables as new columns, or (B) set up an additional table that maps hashed ids to real ids. In (B) the hashed id becomes the key for the join back to the existing table, and only the anonymised data is stored in the existing table.

A) Probably the least effort, since all the data is contained in one table. The drawback is that the output stage needs a setting somewhere to choose between exporting clean or identifying data.

B) A more robust solution: no identifying data is stored in the "working" table, so output is de-identified natively and data can only be re-identified by a separate operation. Possibly less efficient.
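
To make option B concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`identity_map`, `posts`), the column names, and the sample values are all made up for illustration and are not the toolkit's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table: the only place real identities are stored.
    CREATE TABLE identity_map (
        hashed_id    TEXT PRIMARY KEY,
        author_id    TEXT NOT NULL,
        display_name TEXT NOT NULL
    );
    -- Working table: holds anonymised data only.
    CREATE TABLE posts (
        hashed_id TEXT NOT NULL REFERENCES identity_map(hashed_id),
        text      TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO identity_map VALUES (?, ?, ?)", ("a1b2c3", "111", "Alice"))
conn.execute("INSERT INTO posts VALUES (?, ?)", ("a1b2c3", "hello"))

# Default output path: de-identified, no join required.
deidentified = conn.execute("SELECT hashed_id, text FROM posts").fetchall()

# Re-identification is a separate, explicit operation via the join.
reidentified = conn.execute("""
    SELECT m.author_id, m.display_name, p.text
    FROM posts AS p
    JOIN identity_map AS m ON m.hashed_id = p.hashed_id
""").fetchall()
```

With this layout the lookup table could be stored or access-controlled separately from the working data, at the cost of an extra join whenever identifying output is needed.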

Need to retain unique ids across multiple data sets, so the same author should map to the same anonymised id in each one. Look into hashing.
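
One possible approach, sketched under assumptions: a keyed hash (HMAC-SHA256 from Python's standard library) is deterministic, so the same author_id produces the same hashed id in every data set as long as the same key is used. The function name `hash_author_id` and the key value are hypothetical. A key is used rather than a plain hash because author ids are often guessable, and an unkeyed hash could be reversed by hashing candidate ids.

```python
import hashlib
import hmac

def hash_author_id(author_id: str, secret_key: bytes) -> str:
    """Return a stable hex pseudonym for author_id under the given key."""
    return hmac.new(secret_key, author_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Placeholder key for illustration; a real key would be kept secret
# and reused for every data set that must share ids.
PROJECT_KEY = b"example-key-not-for-real-use"

# Two made-up data sets that share one author ("111").
dataset_a = ["111", "222"]
dataset_b = ["333", "111"]

hashed_a = [hash_author_id(a, PROJECT_KEY) for a in dataset_a]
hashed_b = [hash_author_id(a, PROJECT_KEY) for a in dataset_b]
```

Here `hashed_a[0]` and `hashed_b[1]` are identical, so rows from the two data sets can still be matched on the hashed id. Changing the key would break that link.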
