This shell script downloads the HTML version of books you have access to from Cambridge Core.
The Core website lets you download PDFs of some books you have access to, but it offers no way to download the HTML. I want the HTML because I can reflow the text, change the font size, and use it with text-to-speech programs.
Some books don't have HTML versions that work with this script. If it says 'View full HTML' on the page, it will work. If it says 'Online view' it will not work. If only PDFs are available, it will not work.
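One rough way to check this ahead of time is to search a saved copy of the book's page for those labels. The sketch below is not part of the script: the function name `html_mode` is hypothetical, and it assumes the label text appears literally in the page source.

```shell
# Hypothetical helper: report which viewing option a saved page offers.
# Assumes the label text appears literally in the saved HTML.
html_mode() {
    if grep -q 'View full HTML' "$1"; then
        echo "supported"
    elif grep -q 'Online view' "$1"; then
        echo "unsupported"
    else
        echo "pdf-only-or-unknown"
    fi
}
```

Call it as `html_mode saved-page.html`, where `saved-page.html` is whatever name you gave the saved page.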
Log in to Cambridge Core. Use a browser extension, e.g. the cookies.txt Chrome extension, to export a cookies.txt file that this script will use. The cookies.txt file needs to be in the same directory as
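Before running the script, it can help to confirm that the exported file actually contains Cambridge cookies. The check below is a sketch, not part of the script: the function name `has_core_cookies` is hypothetical, and it assumes the extension exports the tab-separated Netscape cookie format with the domain in the first field.

```shell
# Hypothetical pre-flight check: succeed only if the cookie file exists
# and has at least one entry whose domain field mentions cambridge.org.
has_core_cookies() {
    cookie_file=${1:-cookies.txt}
    [ -f "$cookie_file" ] || return 1
    # Netscape-format lines are tab-separated; field 1 is the domain.
    awk -F '\t' '$1 ~ /cambridge\.org/ { found = 1 } END { exit !found }' "$cookie_file"
}
```

For example, `has_core_cookies cookies.txt || echo "No Cambridge cookies found"` prints a warning if the export did not work.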
Make executable: chmod +x
You will need to have pup installed.
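To confirm that pup is on your PATH before running the script, a generic check like the following can be used. The function name `require_tool` is hypothetical and not something the script itself provides.

```shell
# Hypothetical helper: fail with a message if a command is not on PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "Missing required tool: $1" >&2
        return 1
    fi
}
```

Usage: `require_tool pup || exit 1`.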
The script takes the URL to the book's contents page, e.g. ./
It will download all parts of the book in the order in which they are displayed on the contents page. The output is a single HTML file. If you have Pandoc installed, the script will also use that HTML file to create an ePub.
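The optional ePub step can be sketched roughly as follows. This is an illustration of the idea and not necessarily how the script does it: the function name `html_to_epub` and the output naming are assumptions, and Pandoc is left to infer the input and output formats from the file extensions.

```shell
# Hypothetical sketch: convert an HTML file to ePub only if pandoc is available.
html_to_epub() {
    html_file=$1
    # Derive the output name by swapping the .html extension for .epub.
    epub_file=${html_file%.html}.epub
    if command -v pandoc >/dev/null 2>&1; then
        pandoc "$html_file" -o "$epub_file"
    else
        echo "pandoc not found; keeping $html_file only" >&2
    fi
}
```

Calling `html_to_epub book.html` would write `book.epub` next to the HTML file when Pandoc is installed, and otherwise leave the HTML file untouched.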
Tested on macOS.