abasu17/scraping

scraping

This project is a PDF scraping program written in Python.

  • Python version: 3.6+

Setup Environment

Install PyPDF2

$ pip3 install PyPDF2

Install pdf2image

$ pip3 install pdf2image

Install tabula-py

$ pip3 install tabula-py
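
After running the three installs above, it can be useful to confirm that the packages are importable before running the project. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes the import names `PyPDF2`, `pdf2image`, and `tabula` (the module that `tabula-py` installs).

```python
import importlib.util


def missing_packages(module_names):
    """Return the names from module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Import names assumed for the three packages installed above.
required = ["PyPDF2", "pdf2image", "tabula"]
print("Missing packages:", missing_packages(required))
```

An empty list means all three modules were found. Note that pdf2image also relies on the Poppler utilities and tabula-py on a Java runtime, which pip does not install.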

Setup Project

Clone the Git repository

Run Project

$ python3 scraping.py Absolute_PDF_File_Path Header_String OCR_Mode_On/Off

Example

$ python3 scraping.py "/home/myDesktop/ACC.pdf" "Management Discussion and Analysis" 0

  • Keep in mind: if OCR mode is enabled, the run will take longer.
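
The command line above takes three positional arguments: the PDF path, the header string, and the OCR mode. How `scraping.py` reads them is not shown in this README, so the following is only a sketch of one way such arguments could be parsed. The function name `parse_cli` and the argument names are hypothetical, and it assumes that OCR mode is given as `0` (off) or `1` (on), based on the `0` used in the example.

```python
import argparse


def parse_cli(argv):
    """Parse the three positional arguments described in the README."""
    parser = argparse.ArgumentParser(prog="scraping.py")
    parser.add_argument("pdf_path", help="absolute path to the PDF file")
    parser.add_argument("header", help="header string to search for")
    # Assumption: 0 disables OCR and 1 enables it.
    parser.add_argument("ocr_mode", type=int, choices=[0, 1],
                        help="0 = OCR off, 1 = OCR on (assumed)")
    return parser.parse_args(argv)


# Same values as the README example.
args = parse_cli([
    "/home/myDesktop/ACC.pdf",
    "Management Discussion and Analysis",
    "0",
])
ocr_enabled = bool(args.ocr_mode)
print(args.pdf_path, args.header, ocr_enabled)
```

Quoting the path and the header string on the command line, as in the example, keeps each one as a single argument even when it contains spaces.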
