.. currentmodule:: hepcrawl


What is HEPcrawl?
=================

HEPcrawl is a crawler based on Scrapy (http://scrapy.org). It is the service responsible for harvesting High-Energy Physics content for INSPIRE-HEP. INSPIRE periodically triggers HEPcrawl to harvest from a set of sources, and HEPcrawl then pushes the resulting JSON records back to the INSPIRE ingestion workflows.
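
The harvest-then-push flow can be pictured with a minimal, standard-library-only sketch. It is not HEPcrawl's actual code, which builds on Scrapy: the names ``harvest``, ``run_crawl`` and ``push``, the record fields, and the in-memory sample source are all hypothetical and serve only to illustrate the idea of turning harvested items into JSON records and handing them to a consumer.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List


def harvest(source: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield one cleaned-up record per raw item of a (hypothetical) source."""
    for raw in source:
        yield {
            "title": raw["title"].strip(),
            "source": raw.get("source", "unknown"),
        }


def run_crawl(source: Iterable[Dict[str, str]], push: Callable[[str], None]) -> int:
    """Harvest every record and hand each one to `push` as a JSON string."""
    count = 0
    for record in harvest(source):
        push(json.dumps(record, sort_keys=True))
        count += 1
    return count


# Stand-in for the periodic trigger: a single call on in-memory sample data.
# `pushed` plays the role of the ingestion side and simply collects the JSON.
pushed: List[str] = []
sample_source = [{"title": "  A hypothetical HEP paper ", "source": "example"}]
total = run_crawl(sample_source, pushed.append)
print(total, pushed[0])
```

Passing ``push`` in as a callable keeps the harvesting step independent of where the records end up, which mirrors the separation between the crawler and the ingestion workflows described above.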