Provider scripts may include html tags in record titles #1441

Open

Description

Discovered in WordPress/openverse-catalog#614 with Wikimedia. The script ingests some records whose titles include HTML tags.

Reproduction

  1. Run the Wikimedia provider script locally
  2. Inspect the ingested records, either through the db shell by running `just db-shell` or by opening the generated TSV file through MinIO. You should see multiple records with HTML tags in the title, such as `<div class='fn'> Silver penny of Henry I</div>`.
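
To make the inspection step less manual, here is a rough sketch of a helper that scans TSV text for titles containing tag-like markup. The function name `find_tagged_titles` and the `title_column` parameter are hypothetical; the actual position of the title column in the generated file is not stated in this issue, so it has to be supplied by the caller.

```python
import csv
import io
import re

# Matches things that look like opening or closing HTML tags, e.g. <div ...> or </div>.
TAG_RE = re.compile(r"</?[a-zA-Z][^>]*>")


def find_tagged_titles(tsv_text, title_column):
    """Return every title in the given TSV text that contains an HTML-like tag.

    `title_column` is the zero-based index of the title field. It is an
    assumption left to the caller, since the column layout is not given here.
    """
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        row[title_column]
        for row in reader
        if len(row) > title_column and TAG_RE.search(row[title_column])
    ]


# Made-up sample rows, using the title quoted in the reproduction step.
sample = (
    "1\t<div class='fn'> Silver penny of Henry I</div>\n"
    "2\tPlain title\n"
)
tagged = find_tagged_titles(sample, title_column=1)
```

Here `tagged` holds only the first row's title; the plain title is skipped.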

Additional context

It would be worth investigating how these records appear on the frontend.

We should consider fixing this in the MediaStore, as it may apply to providers other than Wikimedia.
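
One possible shape for such a fix, as a minimal sketch only: strip markup from the title with Python's standard-library `html.parser` before the record is stored. The helper name `strip_html_tags` is hypothetical, and where it would be called inside the MediaStore is an assumption; this issue does not prescribe an implementation.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text content of the markup fed to it, dropping tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_html_tags(title):
    """Hypothetical helper: return `title` without HTML tags.

    Runs of whitespace are collapsed and the result is trimmed.
    `None` is passed through unchanged.
    """
    if title is None:
        return None
    parser = _TextExtractor()
    parser.feed(title)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.text().split())


cleaned = strip_html_tags("<div class='fn'> Silver penny of Henry I</div>")
```

For the title quoted in the reproduction steps, `cleaned` is `Silver penny of Henry I`. Doing this in one shared place rather than in the Wikimedia script would cover other providers too, at the cost of also altering any title that legitimately contains tag-like text.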

Resolution

  • 🙋 I would be interested in resolving this bug.

Metadata

Assignees

No one assigned

    Projects

    • Status

      📋 Backlog
