
Add meta_data cleaning steps to MediaStore #1406

Open

Description

The meta_data field is a JSONB field in the database that holds extra information we receive from provider APIs. At present, several providers each clean this data in their own way:

https://github.com/WordPress/openverse-catalog/blob/6e9d02d65ef42b92bcbc63d7cb1695d13318517d/openverse_catalog/dags/providers/provider_api_scripts/cleveland_museum.py#L100-L111

https://github.com/WordPress/openverse-catalog/blob/9be8bcec541f95dcd38d76a1064ce72cd75f8255/openverse_catalog/dags/providers/provider_api_scripts/brooklyn_museum.py#L76-L85

https://github.com/WordPress/openverse-catalog/blob/c1b970b11dbf63a45a8c2d6a33d6058a662b7ac9/openverse_catalog/dags/providers/provider_api_scripts/stocksnap.py#L160-L170

https://github.com/WordPress/openverse-catalog/blob/323d07bc0786fe4593df1b77fa890ae7f37f5668/openverse_catalog/dags/providers/provider_api_scripts/smk.py#L132-L145

All of these operate in a similar manner: they pull fields out of the response and add them to the meta_data dictionary only if they are not None.
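For illustration, the shared pattern looks roughly like the sketch below. This is not copied from any of the linked scripts; the function name `get_meta_data` and the field names are hypothetical stand-ins.

```python
def get_meta_data(record: dict) -> dict:
    """Hypothetical provider helper: copy selected fields, skipping None."""
    meta_data = {}
    # The field names here are made up for the example.
    for key in ("credit_line", "medium", "description"):
        value = record.get(key)
        # Each provider repeats some variant of this check.
        if value is not None:
            meta_data[key] = value
    return meta_data
```

Every provider script ends up carrying its own copy of the `is not None` check, which is the duplication this issue is about.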

It would be nice to move the None-filtering logic into the MediaStore class's validation steps. Provider scripts could then focus on which fields to pull, rather than on the mechanism for skipping None values up front or removing them afterward.
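A minimal sketch of what the centralized step could look like, assuming a standalone helper that the MediaStore validation would call. The name `clean_meta_data` and its behavior for non-dict input are assumptions for discussion, not the existing MediaStore API.

```python
from typing import Any, Optional


def clean_meta_data(meta_data: Optional[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Hypothetical helper: drop keys whose value is None.

    Assumption: anything that is not a dict is treated as "no meta_data".
    """
    if not isinstance(meta_data, dict):
        return None
    # Only None is filtered; falsy values such as 0 or "" are kept.
    return {key: value for key, value in meta_data.items() if value is not None}
```

With something like this in place, a provider could build the dictionary unconditionally, for example `{"credit_line": record.get("credit_line")}`, and leave the filtering to the store.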

Additional context

Implementation

  • 🙋 I would be interested in implementing this feature.
