Conversation

@janbnz janbnz commented Nov 23, 2025

Closes #14117

Added support for pseudonymizing groups by recursively renaming every group in the group tree. The implementation assigns groups-<id> identifiers and writes the original names to the value-mapping output. This closes the gap where entries were pseudonymized but groups were not.
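
To illustrate the recursive renaming described above, here is a minimal sketch. It does not use JabRef's actual group classes: `GroupNode` and `GroupPseudonymizerSketch` are hypothetical stand-ins, and reusing one pseudonym for groups that share a name is an assumption, not something this PR states.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a group tree node; JabRef's real group classes differ.
final class GroupNode {
    String name;
    final List<GroupNode> children = new ArrayList<>();

    GroupNode(String name) {
        this.name = name;
    }

    // Appends a child and returns this node so that trees can be built fluently.
    GroupNode add(GroupNode child) {
        children.add(child);
        return this;
    }
}

final class GroupPseudonymizerSketch {
    // Original group name -> pseudonym, kept in first-seen order for the mapping output.
    private final Map<String, String> mapping = new LinkedHashMap<>();

    // Renames the given node and all of its descendants (depth-first, parent before children).
    void pseudonymize(GroupNode node) {
        String pseudonym = mapping.get(node.name);
        if (pseudonym == null) {
            // Assumption: identifiers are numbered from 1 in visiting order.
            pseudonym = "groups-" + (mapping.size() + 1);
            mapping.put(node.name, pseudonym);
        }
        node.name = pseudonym;
        for (GroupNode child : node.children) {
            pseudonymize(child);
        }
    }

    Map<String, String> mapping() {
        return mapping;
    }

    public static void main(String[] args) {
        GroupNode root = new GroupNode("Thesis")
                .add(new GroupNode("Related work").add(new GroupNode("Surveys")))
                .add(new GroupNode("Methods"));
        GroupPseudonymizerSketch sketch = new GroupPseudonymizerSketch();
        sketch.pseudonymize(root);
        // Each original name with its pseudonym, as it could appear in the value mapping.
        sketch.mapping().forEach((original, pseudonym) ->
                System.out.println(pseudonym + "," + original));
    }
}
```

The mapping collected during the traversal is what would let the CSV output list the original group names next to their new identifiers.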

Currently, only the group names themselves are pseudonymized (groups-1, groups-2, etc.), but the filters inside the groups (e.g., readstatus = read) are not updated to match the pseudonymized field values (readstatus-1, readstatus-2).

  • Should we also pseudonymize the filter values so that the GUI group counts remain correct after pseudonymization?
  • Or is it acceptable for the pseudonymized .bib to break GUI filters and counts, given that the main goal is anonymization?

Feedback is welcome.

Steps to test

  1. Use the JabKit CLI to pseudonymize a library (as in "Add Pseudonymization to CLI", #13158)
  2. Open the generated pseudonymized .bib file and check that all groups now have names like groups-1, groups-2, etc.
  3. Verify that the CSV mapping contains the original group names corresponding to the new pseudonymized identifiers.

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • I manually tested my changes in running JabRef (always required)
  • I added JUnit tests for changes (if applicable)
  • I added screenshots in the PR description (if change is visible to the user)
  • I described the change in CHANGELOG.md in a way that is understandable for the average user (if change is visible to the user)
  • [/] I checked the user documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request updating file(s) in https://github.com/JabRef/user-documentation/tree/main/en.

@github-actions
Contributor

Hey @janbnz!

Thank you for contributing to JabRef! Your help is truly appreciated ❤️

We have automated checks in place, based on which you will soon get feedback if any of them are failing. In a while, maintainers will also review your contribution. Once that happens, you can go through their comments in the "Files changed" tab and act on them, or reply to the conversation if you have further inputs.

Please re-check our contribution guide in case of any other doubts related to our contribution workflow.

@koppor
Member

koppor commented Nov 23, 2025

> Currently, only the group names themselves are pseudonymized (groups-1, groups-2, etc.), but the filters inside the groups (e.g., readstatus = read) are not updated to match the pseudonymized field values (readstatus-1, readstatus-2).

You are on the right track, but the details seem to be wrong: readstatus is a special field, not a group.

Entries store their group membership in the groups field. If a group was previously named abc and is now groups-1, the entry's groups field has to be updated to match. Please also add a test.

I am talking about "Explicit selection" groups here. There are other types of groups, but I think they do not need to be tackled here, since pseudonymization for them affects other areas (e.g., "renaming" an entry's keywords).
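
A rough sketch of the entry-side update requested above, assuming the groups field holds comma-separated group names. `EntryGroupsRewriteSketch` and `rewriteGroupsField` are hypothetical names and not part of JabRef's API. Leaving unmapped names unchanged is a choice made only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class EntryGroupsRewriteSketch {

    // Replaces every group name in an entry's groups field with its pseudonym.
    // Assumption: names are separated by commas; names missing from the mapping stay as they are.
    static String rewriteGroupsField(String groupsField, Map<String, String> mapping) {
        return Arrays.stream(groupsField.split(","))
                     .map(String::trim)
                     .filter(name -> !name.isEmpty())
                     .map(name -> mapping.getOrDefault(name, name))
                     .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("abc", "groups-1");
        mapping.put("xyz", "groups-2");
        // The entry was in groups "abc" and "xyz" before pseudonymization.
        System.out.println(rewriteGroupsField("abc, xyz", mapping));
    }
}
```

A JUnit test for the real change could follow the same shape: pseudonymize a library with one explicit group and one member entry, then assert that the entry's groups field contains the new group name.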

@koppor added the "status: changes-required" label (Pull requests that are not yet complete) on Nov 23, 2025
Labels

first contrib, status: changes-required (Pull requests that are not yet complete)

Development

Successfully merging this pull request may close these issues.

Pseudonymize groups

2 participants