[User Feedback] Searches page depth

## Due date: yyyy-mm-dd


## Assigned reviewers
- [ ] TBD
- [ ] TBD


## Description

### Context
Hello! I'm a consumer of the OpenVerse API.
I've started building a free, open-source app that allows artists and art students to do timed drawing sessions for practice, using only Creative Commons images.

I'm using the OpenVerse API (with a registered key) to search CC images and fetch their metadata, licence info and creator information for crediting them.

I wanted to provide some feedback on the [/image_search](https://api.openverse.engineering/v1/#operation/image_search) endpoint, which is the one I'm currently working with.

### The issue
The docs make it clear that the API is not intended for in-depth search and will return only the first 10,000 results:

> Although there may be millions of relevant records, only the most relevant several thousand records can be viewed. This is by design: the search endpoint should be used to find the top 10,000 most relevant results, not for exhaustive search or bulk download of every barely relevant result. As such, the caller should not try to access pages beyond page_count, or else the server will reject the query.

While more results would be more convenient for my usage, 10,000 still seems like a fair trade-off. Not too few, not too many.

However, what I expected from reading this was to receive at most 10,000 results, whatever my request is:
-  `page_size=20` -> 500 pages
-  `page_size=100` -> 100 pages
-  `page_size=500` -> 20 pages
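
To make that arithmetic explicit, here is a minimal sketch of the page counts I expected. The names `MAX_RESULTS` and `expected_page_count` are mine, for illustration only; they are not part of the API.

```python
import math

# Documented limit on how many results can be viewed for one search.
MAX_RESULTS = 10_000


def expected_page_count(page_size: int, max_results: int = MAX_RESULTS) -> int:
    """Number of pages needed to browse `max_results` at a given `page_size`."""
    if page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    return math.ceil(max_results / page_size)


for size in (20, 100, 500):
    print(f"page_size={size} -> {expected_page_count(size)} pages")
```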

But I quickly realized that the `page_count` is capped at 20, apparently by design (see https://github.com/WordPress/openverse-api/pull/859).
So, if I ask for a `page_size` of 20, I'll only get 400 browsable results.

This means that if I want to access all 10,000 results, I have to set `page_size=500`, which is, well, not optimal:
- Heavy response payload, so slow requests
- Lots of thumbnails to download, resulting in slow page loading times
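
Here is a rough sketch of how the two limits interact, assuming the 10,000-result limit from the docs and the `page_count` cap of 20 that I observed. The constant and function names are hypothetical and only illustrate the arithmetic.

```python
import math

MAX_RESULTS = 10_000  # documented limit on viewable results
PAGE_COUNT_CAP = 20   # cap on page_count that I observed (assumption)


def browsable_results(page_size: int,
                      page_count_cap: int = PAGE_COUNT_CAP,
                      max_results: int = MAX_RESULTS) -> int:
    """Results actually reachable when both limits apply."""
    return min(max_results, page_size * page_count_cap)


def min_page_size_for_full_depth(page_count_cap: int = PAGE_COUNT_CAP,
                                 max_results: int = MAX_RESULTS) -> int:
    """Smallest page_size that still reaches all `max_results` results."""
    return math.ceil(max_results / page_count_cap)


for size in (20, 100, 500):
    print(f"page_size={size} -> {browsable_results(size)} browsable results")

print("smallest page_size for full depth:", min_page_size_for_full_depth())
```

With these assumptions, any `page_size` below 500 cuts off part of the 10,000 results.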

### Discussion

I'm having trouble understanding why the `page_count` cap was necessary, and why consumers are not simply allowed to browse freely through the first 10,000 records, setting `page_size` as they want.

Is this something you would consider improving, or is the OpenVerse API not a good fit for what I'm trying to do?

Thank you for your time,
looking forward to discussing this with you.

[User Feedback] Searches page depth #668

clementoriol opened on Sep 26, 2022
