Open
Opened on Apr 21, 2021
This issue has been migrated from the CC Search Catalog repository
Author: annatuma
Date: Fri Jun 12 2020
Labels: providers, ✨ goal: improvement, status: discontinued
Prerequisites to this:
- Support for video in CC Search
- Integration of CC license selection on upload for Cinnamon users
We do not expect to be ready to investigate this integration further until some time in 2021.
Provider API Endpoint / Documentation
Provider description
Licenses Provided
Provider API Technical info
Checklist to complete before beginning development
No development should be done on a Provider API Script until the following info is gathered:
- Verify there is a way to retrieve the entire relevant portion of the provider's collection in a systematic way via their API.
- Verify the API provides license info (license type and version; license URL provides both, and is preferred)
- Verify the API provides stable direct links to individual works.
- Verify the API provides a stable landing page URL to individual works.
- Note other info the API provides, such as thumbnails, dimensions, attribution info (required if non-CC0 licenses will be kept), title, description, other metadata, tags, etc.
- Attach example responses to API queries that have the relevant info.
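As a rough illustration of why a license URL is preferred in the checklist above, here is a minimal sketch of recovering both the license type and the version from one. The helper name `parse_license_url` is hypothetical and is not part of the catalog codebase; it only assumes the usual `creativecommons.org/licenses/<type>/<version>/` and `creativecommons.org/publicdomain/zero/<version>/` URL shapes.

```python
from urllib.parse import urlparse


def parse_license_url(license_url):
    """Return (license_type, version) for a CC license URL.

    Hypothetical helper for illustration only. Returns (None, None)
    when the URL does not look like a Creative Commons license URL.
    """
    parsed = urlparse(license_url)
    if not parsed.netloc.lower().endswith("creativecommons.org"):
        return None, None

    # Drop empty segments produced by leading/trailing slashes.
    path_parts = [part for part in parsed.path.split("/") if part]

    # Expected: licenses/<type>/<version> or publicdomain/<type>/<version>
    if len(path_parts) < 3 or path_parts[0] not in ("licenses", "publicdomain"):
        return None, None

    license_type, version = path_parts[1], path_parts[2]
    # The CC0 dedication lives under publicdomain/zero/.
    if path_parts[0] == "publicdomain" and license_type == "zero":
        license_type = "cc0"
    return license_type, version


print(parse_license_url("https://creativecommons.org/licenses/by-sa/4.0/"))
```

If an API only exposes a free-text license name, the type and version would have to be gathered separately, which is why the URL form is easier to verify.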
General Recommendations for implementation
- The script should be in the `src/cc_catalog_airflow/dags/provider_api_scripts/` directory.
- The script should have a test suite in the same directory.
- The script must use the `ImageStore` class (import this from `src/cc_catalog_airflow/dags/provider_api_scripts/common/storage/image.py`).
- The script should use the `DelayedRequester` class (import this from `src/cc_catalog_airflow/dags/provider_api_scripts/common/requester.py`).
- The script must not use anything from `src/cc_catalog_airflow/dags/provider_api_scripts/modules/etlMods.py`, since that module is deprecated.
- If the provider API can be queried by 'upload date' or something similar, the script should take a `--date` parameter when run as a script, giving the date for which we should collect images. The form should be `YYYY-MM-DD` (so the script can be run via `python my_favorite_provider.py --date 2018-01-01`).
- The script must provide a main function that takes the same parameters as the CLI. In our example from above, we'd then have a main function `my_favorite_provider.main(date)`. The main function should do the same thing that calling the script from the CLI would do.
- The script must conform to PEP8. Please use `pycodestyle` (available via `pip install pycodestyle`) to check for compliance.
- The script should use small, testable functions.
- The test suite for the script may break PEP8 rules regarding long lines where appropriate (e.g., long strings for testing).
Examples of other Provider API Scripts
For example Provider API Scripts and accompanying test suites, please see `src/cc_catalog_airflow/dags/provider_api_scripts/flickr.py` and `src/cc_catalog_airflow/dags/provider_api_scripts/test_flickr.py`, or `src/cc_catalog_airflow/dags/provider_api_scripts/wikimedia_commons.py` and `src/cc_catalog_airflow/dags/provider_api_scripts/test_wikimedia_commons.py`.