Mylot -> PDF Scraper

Inspired by this Reddit post, I wrote a scraper to save all of the poster's mum's Mylot posts to PDFs, including images and comments.

Apologies in advance for the messy code; I wrote it in about half an hour. Feel free to open a PR with fixes.

Running

Run it like so, supplying the username of the person you wish to archive:

node ./index.js ridingbet

How it works

First it paginates through all the articles to get a complete list. Once that's done, it uses 5 instances of puppeteer (a headless browser) to create a PDF copy of each article. There's a little bit of trickery to get the output looking good (hiding elements, etc.).
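The two stages above can be sketched roughly as follows. This is a minimal illustration, not the code in index.js: `runPool`, `savePdf`, and `archive` are hypothetical names, the CSS selector is a placeholder, and using five pages in one browser (rather than five separate browsers) is an assumption.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
async function runPool(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function lane() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}

// Assumed shape of the per-article work; the selector below is illustrative only.
async function savePdf(browser, url, outPath) {
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: 'networkidle2' });
    // Hide page chrome before printing so the PDF contains only the article.
    await page.addStyleTag({ content: 'header, footer { display: none !important; }' });
    await page.pdf({ path: outPath, format: 'A4', printBackground: true });
  } finally {
    await page.close();
  }
}

// Assumed entry point: `articles` is a list of { url, file } objects.
async function archive(articles) {
  const puppeteer = require('puppeteer'); // loaded lazily so runPool works on its own
  const browser = await puppeteer.launch();
  try {
    await runPool(articles, 5, (a) => savePdf(browser, a.url, a.file));
  } finally {
    await browser.close();
  }
}
```

Capping the pool at 5 keeps memory use bounded while still overlapping page loads, which is where most of the time goes.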

The PDFs are named with the ISO timestamp first, so they sort nicely in the folder from oldest to newest.
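As a rough sketch of why this works: ISO 8601 timestamps sort lexicographically in chronological order, so putting one at the start of the filename is enough. The helper name `pdfName`, the slug rules, and the exact filename layout below are assumptions for illustration and may differ from what index.js produces.

```javascript
// Hypothetical helper: build a sortable PDF filename from a post date and title.
function pdfName(date, title) {
  // Colons are not allowed in filenames on some systems, so swap them for dashes.
  const stamp = date.toISOString().replace(/:/g, '-');
  // Reduce the title to lowercase letters, digits, and single dashes.
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return `${stamp}_${slug}.pdf`;
}

const names = [
  pdfName(new Date('2016-01-15T08:30:00Z'), 'A later post'),
  pdfName(new Date('2015-03-02T10:05:00Z'), 'Hello, World!'),
];

// A plain string sort puts the oldest post first.
names.sort();
console.log(names.join('\n'));
```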

Licence

MIT, do what you want :)

krishi-vb/mylot-article-scraper
