
Upper bound of 10000 queries means I can't access the entirety of INSPIRE institutions #20

Open
@smeehan12


I am trying to use the API to collect geographical information on publications worldwide, in order to understand how publication output differs between institutions located in different regions. To do so, I am making calls to URLs like

https://inspirehep.net/api/institutions?sort=mostrecent&size=1&page=1

which lets me query the metadata associated with the institutional publication records.

This works well and gives me all the information I need. However, there seems to be an upper limit that prevents me from accessing all of the data, because when I try a call like

https://inspirehep.net/api/institutions?sort=mostrecent&size=10&page=1001

I get a return of

{"status": 400, "message": "Maximum number of 10000 results have been reached."}

Now, I see that at most 1000 records can be requested in a single call, but this upper bound of 10000 results in total is a problem: it means I can't access the data for the full set of 11791 institutions that have publications in HEP via this API.
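To make the arithmetic concrete, here is a minimal sketch of how I understand the limit. It assumes the API rejects any request where page * size exceeds 10000, which is consistent with the failing call above (1001 * 10 = 10010) but is my guess rather than documented behaviour. The helper names (`is_reachable`, `page_urls`) are mine and not part of the API. The code only builds URLs and makes no network calls.

```python
from urllib.parse import urlencode

BASE_URL = "https://inspirehep.net/api/institutions"
MAX_WINDOW = 10000        # limit quoted in the error message
TOTAL_INSTITUTIONS = 11791  # number of institution records I expect


def is_reachable(page, size, max_window=MAX_WINDOW):
    """Assumed rule: a page is served only if its last record lies within the window."""
    return page * size <= max_window


def page_urls(total, size, max_window=MAX_WINDOW):
    """Yield the URLs of every page that stays inside the assumed result window."""
    reachable = min(total, max_window)
    n_pages = -(-reachable // size)  # ceiling division
    for page in range(1, n_pages + 1):
        params = {"sort": "mostrecent", "size": size, "page": page}
        yield f"{BASE_URL}?{urlencode(params)}"


urls = list(page_urls(TOTAL_INSTITUTIONS, size=1000))
unreachable = max(0, TOTAL_INSTITUTIONS - MAX_WINDOW)
print(len(urls), "pages reachable;", unreachable, "records out of reach")
```

Under that assumption, even with the largest page size of 1000 only ten pages can be fetched, so 1791 of the 11791 records stay out of reach with plain page-based pagination.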

Is there some reason why this upper bound exists? Or am I misusing the API?
