ArXiv Miner: designed to scrape PDF files from export_ArXiv sources.

psreid/ArXiv_Miner

# ArXiv Miner is designed to: Retrieve, Save, and Query ArXiv PDFs
# Included in this package is a CSV file of all Hep-ex IDs
# However, all of the IDs are regularly maintained on Kaggle
# https://www.kaggle.com/Cornell-University/arxiv
# ------------
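
As a rough illustration of the "Retrieve" and "Save" steps, the sketch below turns an ArXiv ID into a PDF URL and a local file name. It is not taken from the ArXiv_Miner source: the `EXPORT_BASE` URL pattern and the helper names `pdf_url`, `pdf_filename`, and `download` are assumptions made for this example, and the package's own functions may look different.

```python
import re
import urllib.request
from pathlib import Path

# Assumed URL pattern for the export mirror; not confirmed by this repository.
EXPORT_BASE = "https://export.arxiv.org/pdf/"


def pdf_url(arxiv_id: str) -> str:
    """Build the (assumed) PDF URL for one ArXiv ID."""
    return EXPORT_BASE + arxiv_id.strip()


def pdf_filename(arxiv_id: str) -> str:
    """Turn an ID into a safe file name (old-style IDs contain a slash)."""
    safe = re.sub(r"[^A-Za-z0-9.\-]", "_", arxiv_id.strip())
    return safe + ".pdf"


def download(arxiv_id: str, out_dir: str = "pdfs") -> Path:
    """Hypothetical helper: fetch one PDF and save it under out_dir."""
    target = Path(out_dir) / pdf_filename(arxiv_id)
    target.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(pdf_url(arxiv_id), str(target))
    return target
```

When looping over many IDs, it is usually a good idea to pause between requests so the mirror is not overloaded.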

# Setup instructions for CEDAR
# ------------
# Set up a virtual env with the following commands

# module load python/3.7
# virtualenv --no-download ARXENV
# source ARXENV/bin/activate

# Navigate to ArXiv_Miner source

# pip3 install numpy
# pip3 install pandas
# pip3 install pdftotext
# python3 setup.py install (this command is not necessary on CEDAR)
#
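
Once `pdftotext` is installed, the "Query" step could look roughly like the sketch below, which extracts the text of each page and reports which pages mention a keyword. The helper names `load_pages` and `pages_matching` are hypothetical and are not part of ArXiv_Miner; only the `pdftotext.PDF` call comes from the installed library.

```python
from typing import List


def load_pages(pdf_path: str) -> List[str]:
    """Return the text of each page of a saved PDF."""
    import pdftotext  # imported lazily so the search helper works without it

    with open(pdf_path, "rb") as fh:
        return list(pdftotext.PDF(fh))


def pages_matching(pages: List[str], keyword: str) -> List[int]:
    """Return zero-based indices of pages containing keyword (case-insensitive)."""
    needle = keyword.lower()
    return [i for i, text in enumerate(pages) if needle in text.lower()]
```

A typical use would be `pages_matching(load_pages("some_paper.pdf"), "boson")`, where the file name is a placeholder for a PDF you have already saved.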


# Then you can run the BDT_workflow script from job submissions
