Open
Opened on Jun 13, 2023
Currently, Presidio does not merge adjacent results having the same entity type. This causes the output to have multiple placeholders when replacing.
For example, with some NER models, `My name is Dave Jones` could produce `My name is <PERSON> <PERSON>`.
It would be better if the output were `My name is <PERSON>`, especially when replacing tokens with fake values or using OpenAI for fake data generation.
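
To make the request concrete, here is a minimal sketch of one possible merge step, not Presidio's actual implementation. The `Span` class, `merge_adjacent`, and `replace_with_placeholders` are hypothetical names introduced only for illustration. The sketch assumes that two results are "adjacent" when they share an entity type and are separated by nothing but whitespace, and that the merged result keeps the lower of the two scores; both are design choices a real implementation might make differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    # Hypothetical stand-in for an analyzer result; not a Presidio class.
    entity_type: str
    start: int
    end: int
    score: float


def merge_adjacent(text: str, spans: List[Span]) -> List[Span]:
    """Merge same-type spans that overlap or are separated only by whitespace."""
    merged: List[Span] = []
    for span in sorted(spans, key=lambda s: (s.start, s.end)):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.entity_type == span.entity_type
            # An empty slice (overlap or direct contact) also counts as adjacent.
            and text[prev.end:span.start].strip() == ""
        ):
            prev.end = max(prev.end, span.end)
            # Assumption: keep the more conservative (lower) score.
            prev.score = min(prev.score, span.score)
        else:
            # Copy so the caller's spans are not mutated.
            merged.append(Span(span.entity_type, span.start, span.end, span.score))
    return merged


def replace_with_placeholders(text: str, spans: List[Span]) -> str:
    """Replace each span with <ENTITY_TYPE>, working right to left so offsets stay valid."""
    out = text
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        out = out[:span.start] + f"<{span.entity_type}>" + out[span.end:]
    return out


text = "My name is Dave Jones"
spans = [Span("PERSON", 11, 15, 0.85), Span("PERSON", 16, 21, 0.85)]

print(replace_with_placeholders(text, spans))
print(replace_with_placeholders(text, merge_adjacent(text, spans)))
```

With the unmerged spans, the first call yields two placeholders; after merging, the second call yields a single one. Spans of different entity types, or spans separated by non-whitespace text, are left untouched.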