Description
Context
Hello! I'm a consumer of the OpenVerse API.
I've started building a free, open-source app that lets artists and art students do timed drawing sessions for practice, using only Creative Commons images.
I'm using the OpenVerse API (with a registered key) to search CC images and fetch their metadata, licence info and creator information for crediting them.
I wanted to provide some feedback on the /image_search endpoint, which is the one I'm currently working with.
The issue
The docs make it clear that the API is not intended for in-depth search and will only return the first 10,000 results:
Although there may be millions of relevant records, only the most relevant several thousand records can be viewed. This is by design: the search endpoint should be used to find the top 10,000 most relevant results, not for exhaustive search or bulk download of every barely relevant result. As such, the caller should not try to access pages beyond page_count, or else the server will reject the query.
While more results would be more convenient for my use case, 10,000 still seems like a fair trade-off. Not too few, not too many.
However, reading this, I expected to receive at most 10,000 results regardless of the page size I request:
- page_size=20 -> 500 pages
- page_size=100 -> 100 pages
- page_size=500 -> 20 pages
But I quickly realized that page_count is capped at 20, apparently by design (see WordPress/openverse-api#859).
So if I ask for a page_size of 20, I only get 400 browsable results (20 pages × 20 results).
This means that to access all 10,000 results, I have to set page_size=500, which is, well, not optimal:
- Heavy response payload, hence slow requests
- A lot of thumbnails to download, resulting in slow page loading times
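To make the arithmetic above concrete, here is a minimal sketch comparing what I expected with what I observe. It makes no API calls; the constants are simply the 10,000-result limit from the docs and the page_count cap of 20 described in this report, and the function names are hypothetical helpers of mine, not part of the API.

```python
import math

MAX_RESULTS = 10_000   # result ceiling stated in the docs
PAGE_COUNT_CAP = 20    # page_count cap as observed in this report (assumption)


def expected_pages(page_size: int) -> int:
    """Pages needed to cover MAX_RESULTS if only the result ceiling applied."""
    return math.ceil(MAX_RESULTS / page_size)


def reachable_results(page_size: int) -> int:
    """Results actually browsable when page_count is also capped."""
    return min(MAX_RESULTS, page_size * PAGE_COUNT_CAP)


for size in (20, 100, 500):
    print(size, expected_pages(size), reachable_results(size))
# 20 500 400
# 100 100 2000
# 500 20 10000
```

Under these assumptions, only page_size=500 reaches the full 10,000 results; page_size=20 reaches 4% of them.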
Discussion
I'm having trouble understanding why the page_count cap was necessary, and why consumers are not simply allowed to browse freely through the first 10,000 records with whatever page_size they want.
Is this something you would consider improving, or is the OpenVerse API not a good fit for what I'm trying to do?
Thank you for your time,
looking forward to discussing this with you.