Convert a PDF paper into an HTML page.
You can easily translate the paper with your browser's translation feature, and you can view the original document and the translated document at the same time.
Albanie, Samuel, Sébastien Ehrhardt, and Joao F. Henriques. "Stopping gan violence: Generative unadversarial networks." arXiv preprint arXiv:1703.02528 (2017).
If you want to convert papers more accurately, you can also use a good experimental service from the Allen Institute for AI.
- Convert PDF files on the Internet easily by using a bookmarklet.
- Support for double-column papers.
$ docker run --rm -it -p 6003:6003 ghcr.io/ktaaaki/paper2html
Use with care, as this exposes port 6003 on the host.
$ sudo apt install poppler-utils poppler-data
$ git clone https://github.com/ktaaaki/paper2html.git
$ pip install -e paper2html
$ python3 ./paper2html/main.py
$ brew install poppler
$ git clone https://github.com/ktaaaki/paper2html.git
$ pip install -e paper2html
$ python3 ./paper2html/main.py
Download the Poppler for Windows binary from http://blog.alivate.com.au/poppler-windows/
Add the Poppler for Windows path (e.g. C:\Users\YOUR_NAME\Downloads\poppler-0.68.0\bin) to the PATH environment variable.
Verify that the following command prints the path.
> where.exe pdfinfo
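The same check can also be made from Python, which may be handy on any OS. This is a minimal sketch using only the standard library; the helper name `find_poppler_tool` is hypothetical and not part of paper2html.

```python
import shutil
from typing import Optional


def find_poppler_tool(name: str = "pdfinfo") -> Optional[str]:
    """Return the full path of a Poppler tool found on PATH, or None.

    Hypothetical helper for illustration; not part of paper2html.
    """
    # shutil.which searches the directories listed in the PATH variable.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_poppler_tool()
    if path is None:
        print("pdfinfo not found; check that the Poppler bin directory is in PATH")
    else:
        print(f"pdfinfo found at {path}")
```

If this reports that pdfinfo was not found, open a new terminal after editing PATH and try again.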
Download the zip file or use the git clone command to save the paper2html code locally, and then install it using the following command.
> py -m pip install -e paper2html
> python .\paper2html\main.py
Upload a PDF file to the server by using this bookmarklet.
javascript:var esc=encodeURIComponent;var d=document;var subw=window.open('http://localhost:6003/paper2html/convert?url='+esc(location.href)).document;
Click the bookmarklet while a PDF paper is open in your browser.
The conversion then starts, and the generated HTML opens after a while.
You can see the list of converted documents on the index page at localhost:6003/paper2html/index.html
NOTE👉 If you are running a paper2html server on Docker, you will not be able to convert PDF files on the host OS with the bookmarklet. See the Docker image documentation.
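For reference, the bookmarklet only opens the server's convert endpoint with the current page URL percent-encoded as the `url` query parameter. The sketch below builds the same request URL in Python. The endpoint is taken from the bookmarklet, while the function name `build_convert_url` and the example arXiv URL are assumptions made for illustration.

```python
from urllib.parse import quote

# Endpoint copied from the bookmarklet; adjust host and port if your server differs.
CONVERT_ENDPOINT = "http://localhost:6003/paper2html/convert"


def build_convert_url(pdf_url: str, endpoint: str = CONVERT_ENDPOINT) -> str:
    """Build the request URL that the bookmarklet would open (hypothetical helper)."""
    # JavaScript's encodeURIComponent leaves ! * ' ( ) unescaped, so mark them
    # as safe here; letters, digits, and - _ . ~ are never escaped by quote().
    encoded = quote(pdf_url, safe="!*'()")
    return f"{endpoint}?url={encoded}"


if __name__ == "__main__":
    # Example PDF URL, used purely as an illustration.
    print(build_convert_url("https://arxiv.org/pdf/1703.02528"))
    # prints http://localhost:6003/paper2html/convert?url=https%3A%2F%2Farxiv.org%2Fpdf%2F1703.02528
```

Passing the result to `webbrowser.open` from the standard library should have the same effect as clicking the bookmarklet, assuming the server is running locally.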
Run this command, then open the generated HTML file in your browser.
$ python paper2html/commands.py "path-to-paper-file.pdf"
In IPython, do the following.
>>> import paper2html
>>> paper2html.open_paper_htmls("path-to-paper-file.pdf")
You can specify a particular browser.
$ python paper2html/commands.py "path-to-paper-file.pdf" --browser_path="/path/to/browser"
You can also convert without opening a browser.
>>> import paper2html
>>> paper2html.paper2html("path-to-paper-file or directory")