
Incorporate Rekognition data into the catalog #431

Open

Description

Summary

Rekognition data in the form of object labels was collected for roughly 100 million records in the Openverse catalog.

These labels should be sanitized for suitability in the Openverse project and applied to records in the Openverse Catalog as tags.

Description

Some exploratory work was done to assess the quality of these labels. The team generally felt positive about them, provided we remove a subset wholesale (e.g. labels that assume a gender). We will need a broader analysis to determine whether there are more labels we want to exclude, and then merge the remaining labels into the existing tags for each record in the catalog. Each automated tag comes with a confidence score, and we should also factor those scores into the overall document score for relevant searches.
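To make the filtering step concrete, here is a minimal sketch of turning Rekognition-style labels into tag entries while dropping excluded labels. The input shape (`Name`, with `Confidence` on a 0 to 100 scale) follows Rekognition's label output; the exclusion list, the `labels_to_tags` helper, and the output keys (`name`, `provider`, `accuracy`) are assumptions for illustration, not the agreed design.

```python
# Hypothetical exclusion list: the real one would come out of the broader analysis.
EXCLUDED_LABELS = frozenset({"man", "woman", "boy", "girl"})


def labels_to_tags(labels, excluded=EXCLUDED_LABELS, provider="rekognition"):
    """Convert Rekognition-style labels into tag dicts, skipping excluded labels.

    Each label is expected to look like {"Name": "Dog", "Confidence": 98.5}.
    The confidence is rescaled to 0..1 and kept on the tag so that it can
    later feed into search scoring.
    """
    tags = []
    for label in labels:
        name = label["Name"].strip()
        if name.lower() in excluded:
            continue  # blanket removal, case-insensitive
        tags.append(
            {
                "name": name,
                "provider": provider,  # marks the tag as machine-generated
                "accuracy": round(label["Confidence"] / 100, 4),
            }
        )
    return tags


sample = [
    {"Name": "Dog", "Confidence": 98.5},
    {"Name": "Woman", "Confidence": 99.1},
]
print(labels_to_tags(sample))
```

Keeping the confidence on each tag, rather than discarding low-confidence labels up front, leaves the scoring decision to the search layer; a minimum-confidence cutoff could be added here if the analysis calls for one.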

Best guess at the list of implementation plans:

  • Strategy for filtering the tags and then upserting them into their associated records
  • Determining whether and how to surface these tags in the frontend and differentiate them from provider-supplied tags
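The two items above could be approached roughly as follows: merge generated tags into a record's existing tags in a way that is safe to re-run, and use the tag's `provider` value to tell machine-generated tags apart from provider-supplied ones. This is only a sketch; the tag shape, the `rekognition` provider name, and the `merge_tags` and `split_by_source` helpers are hypothetical, and the actual upsert would happen in the catalog database rather than in memory.

```python
# Hypothetical set of provider names that mark a tag as machine-generated.
MACHINE_PROVIDERS = frozenset({"rekognition"})


def merge_tags(existing, generated):
    """Merge generated tags into existing ones without creating duplicates.

    Tags are keyed by (lowercased name, provider), so a provider-supplied
    "dog" and a machine-generated "Dog" are both kept, while re-running the
    merge with the same generated tags does not add anything new.
    """
    merged = {}
    for tag in list(existing) + list(generated):
        key = (tag["name"].lower(), tag.get("provider"))
        merged[key] = tag  # a later entry replaces an earlier one with the same key
    return list(merged.values())


def split_by_source(tags):
    """Separate provider-supplied tags from machine-generated tags."""
    supplied = [t for t in tags if t.get("provider") not in MACHINE_PROVIDERS]
    generated = [t for t in tags if t.get("provider") in MACHINE_PROVIDERS]
    return supplied, generated


existing = [{"name": "dog", "provider": "flickr"}]
generated = [{"name": "Dog", "provider": "rekognition", "accuracy": 0.985}]

record_tags = merge_tags(existing, generated)
record_tags = merge_tags(record_tags, generated)  # re-running adds nothing
supplied, machine = split_by_source(record_tags)
print(len(record_tags), len(supplied), len(machine))
```

Keying on the provider as well as the name keeps the original provider tags untouched, which also makes it straightforward to remove or refresh the generated tags later.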

Documents

Issues

Milestone

Incorporate Rekognition Data


Metadata

Labels

  • 🌟 goal: addition (Addition of new feature)
  • 💻 aspect: code (Concerns the software code in the repository)
  • 🧭 project: thread (An issue used to track a project and its progress)
  • 🧱 stack: catalog (Related to the catalog and Airflow DAGs)

Type

No type

Projects

  • Status

    ⏸ On Hold

Relationships

None yet

Development

No branches or pull requests
