Anonymize scanner tries to download the zh_core_web_sm model even when the only language specified is English #227

Open
@rajatm91

Describe the bug
The Anonymize scanner tries to download the zh_core_web_sm model even when the only language specified is English.

Config for the scanner:

  - type: Anonymize
    params:
      use_faker: false
      threshold: 0.75
      model_path: "./distilbert_finetuned_ai4privacy_v2"
      language: "en"

I have already installed the en_core_web_sm model in my environment.

Expected behavior
The scanner shouldn't download new models when the model for the configured language is already installed.

Additional context
I think the issue is at this line, where it's not using the language parameter passed to the class but a global variable, ALL_SUPPORTED_LANGUAGES, which includes both en and zh.
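To illustrate the suspected cause, here is a minimal sketch of the difference between building the spaCy model list from the global language list and building it from the configured language. This is not the library's actual code: `SPACY_MODELS`, `models_from_global` and `models_for_language` are hypothetical names, and the `lang_code` / `model_name` dictionary shape is an assumption about how the scanner describes models to its NLP engine.

```python
# Hypothetical sketch of the suspected behaviour; not the scanner's real code.

# Global list named in the issue: it contains both languages.
ALL_SUPPORTED_LANGUAGES = ["en", "zh"]

# Assumed mapping from language code to spaCy model name.
SPACY_MODELS = {"en": "en_core_web_sm", "zh": "zh_core_web_sm"}


def models_from_global() -> list[dict[str, str]]:
    """Suspected current behaviour: every supported language is requested,
    so zh_core_web_sm is needed even when only "en" is configured."""
    return [
        {"lang_code": lang, "model_name": SPACY_MODELS[lang]}
        for lang in ALL_SUPPORTED_LANGUAGES
    ]


def models_for_language(language: str) -> list[dict[str, str]]:
    """Possible fix: request only the model for the configured language."""
    if language not in ALL_SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    return [{"lang_code": language, "model_name": SPACY_MODELS[language]}]


if __name__ == "__main__":
    print([m["model_name"] for m in models_from_global()])
    print([m["model_name"] for m in models_for_language("en")])
```

With `language: "en"`, the second function would ask only for en_core_web_sm, which is already installed, so no download would be triggered.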
