Merge two entities from the same type with whitespace between them #1090

Open

Description

Currently, Presidio does not merge adjacent results that have the same entity type. As a result, the output contains multiple placeholders when the entities are replaced.

For example, with some NER models, `My name is Dave Jones` could produce `My name is <PERSON> <PERSON>`.
It would be better for the output to be `My name is <PERSON>`, especially when replacing tokens with fake values or using OpenAI for fake data generation.
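
The requested behavior can be sketched in plain Python. This is only an illustration of the idea, not Presidio's API: the `Span` class and the `merge_adjacent` and `replace_with_placeholders` helpers are hypothetical names introduced here, and the character offsets and scores are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for a detection result (not a Presidio class)."""
    entity_type: str
    start: int
    end: int
    score: float


def merge_adjacent(text: str, spans: List[Span]) -> List[Span]:
    """Merge same-type spans separated only by whitespace (or nothing)."""
    merged: List[Span] = []
    for span in sorted(spans, key=lambda s: (s.start, s.end)):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.entity_type == span.entity_type
            and span.start >= prev.end  # overlapping spans are left alone
            and text[prev.end:span.start].strip() == ""
        ):
            # Extend the previous span; keeping the higher score is an assumption.
            merged[-1] = Span(
                prev.entity_type, prev.start, span.end, max(prev.score, span.score)
            )
        else:
            merged.append(span)
    return merged


def replace_with_placeholders(text: str, spans: List[Span]) -> str:
    """Replace each span with <ENTITY_TYPE>, working right to left."""
    out = text
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        out = out[:span.start] + f"<{span.entity_type}>" + out[span.end:]
    return out


text = "My name is Dave Jones"
# Illustrative spans, as an NER model might emit one result per token.
spans = [Span("PERSON", 11, 15, 0.85), Span("PERSON", 16, 21, 0.85)]

unmerged_output = replace_with_placeholders(text, spans)
merged_output = replace_with_placeholders(text, merge_adjacent(text, spans))
```

Here `unmerged_output` reproduces the current behavior described above, while `merged_output` contains a single placeholder. Whether spans that touch with no gap should also be merged, and how the merged score should be computed, are open design choices.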
