Releases: norvig/paip-lisp
Scanned PDFs, now with tables of contents
@anrddh updated the scanned PDFs, adding comprehensive tables of contents:
- 4th printing, 1998
  - higher resolution, better OCR, smaller file thanks to better compression
  - previous release
- 6th printing, 2001
  - now including metadata, like title and author
  - previous release
For more details, see #166
Alpha drafts of ebooks from markdown
This is a rough draft! Issues are expected.
what's here
- epub - made by pandoc from markdown
- pdf - made by calibre from the epub version
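The two conversion steps above can be sketched as a small Python wrapper around the command-line tools. This is a minimal sketch, not the exact commands used for this release: the file names (`book.epub`, `book.pdf`), the chapter directory, and the helper names are hypothetical, and it assumes `pandoc` and calibre's `ebook-convert` are installed and on the PATH.

```python
import subprocess
from pathlib import Path


def pandoc_epub_command(chapters, output="book.epub"):
    """Build a pandoc command that joins markdown chapters into one epub."""
    # pandoc infers the epub output format from the .epub extension.
    return ["pandoc", "-o", str(output), *[str(c) for c in chapters]]


def calibre_pdf_command(epub="book.epub", output="book.pdf"):
    """Build a calibre command that converts the epub to a pdf."""
    # ebook-convert picks input and output formats from the file extensions.
    return ["ebook-convert", str(epub), str(output)]


def build_ebooks(chapter_dir):
    """Run both steps; chapter files are taken in sorted filename order
    (an assumption about how the markdown files are named)."""
    chapters = sorted(Path(chapter_dir).glob("*.md"))
    subprocess.run(pandoc_epub_command(chapters), check=True)
    subprocess.run(calibre_pdf_command(), check=True)
```

Calling `build_ebooks("docs")` on a hypothetical `docs` directory of markdown chapters would write `book.epub` and then `book.pdf` using each tool's default layout.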
features
- we can proofread and annotate on offline devices! 🎉
- includes the WIP effort to turn images to markdown tables
- includes the redrawn diagrams, as vector graphics
known issues
- chapter links in the Preface chapter don't work
- many other chapter links have never been fixed up, and don't work at all
- some tables have irregular column widths
- the TOC numbering is slightly out of sync, counting the frontmatter as chapter 1, About as chapter 2, Preface as chapter 3, Chapter 1 as chapter 4, etc.
epub
- the epub has the svg diagrams and formulae, not the png fallbacks
- Apple Books on desktop may not be showing code blocks (unconfirmed)
- an extra page break before section 12.1
- Bluefire appears to have some intermittent font glitches with italics
- some lines are split between pages, and some lines are truncated at the right
what's next
- pandoc worked for making a legible first draft, but another tool might fix or sidestep the issues here, so exploring alternatives is a good idea
- these were made with the software's default layouts; customizing font sizes, margins, and the like can probably wait
A better scan
About this copy
This is a scanned copy of the 4th printing, 1998. It's shared for reading, and for improving the Markdown copy in our GitHub repo.
How it was made
@pronoiac had the spine / binding removed and fed the pages through a scanner. Steps and software used:
- scanner gave 600dpi grayscale, as 3.6 gigabytes of png files
- ScanTailor Advanced (in Docker): deskew the pages and render them as 300dpi black and white (1-bit) tiffs - 30 megabytes
- tiff2pdf and pdfunite: turn those many tiffs into one pdf
- OCRmyPDF: OCR with Tesseract, add title and author to the pdf, apply lossless JBIG2 compression - 24 megabytes
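The last two steps above might be scripted roughly as follows. This is a sketch under stated assumptions, not the commands actually used: the ScanTailor step is left out, the directory and file names (`pages`, `merged.pdf`, `book.pdf`) and helper names are hypothetical, and the title and author are passed in rather than guessed. It assumes `tiff2pdf`, `pdfunite`, and `ocrmypdf` are on the PATH.

```python
import subprocess
from pathlib import Path


def tiff_to_pdf_commands(tiffs, workdir="pages"):
    """Build one tiff2pdf call per page: tiff2pdf -o page.pdf page.tif"""
    commands = []
    for tiff in tiffs:
        page_pdf = Path(workdir) / (Path(tiff).stem + ".pdf")
        commands.append(["tiff2pdf", "-o", str(page_pdf), str(tiff)])
    return commands


def pdfunite_command(page_pdfs, output="merged.pdf"):
    """Build the merge command; pdfunite takes inputs in order, output last."""
    return ["pdfunite", *[str(p) for p in page_pdfs], str(output)]


def ocrmypdf_command(merged, output, title, author):
    """Build the OCR command; OCRmyPDF runs Tesseract and sets PDF metadata."""
    # --optimize 1 asks for lossless optimization only. Whether JBIG2 is
    # applied depends on a jbig2 encoder being installed (an assumption here).
    return ["ocrmypdf", "--title", title, "--author", author,
            "--optimize", "1", str(merged), str(output)]


def run_pipeline(tiffs, title, author, output="book.pdf"):
    """Run all three steps on an ordered list of 1-bit tiff pages."""
    Path("pages").mkdir(exist_ok=True)
    commands = tiff_to_pdf_commands(tiffs)
    for command in commands:
        subprocess.run(command, check=True)
    page_pdfs = [command[2] for command in commands]
    subprocess.run(pdfunite_command(page_pdfs), check=True)
    subprocess.run(ocrmypdf_command("merged.pdf", output, title, author), check=True)
```

Keeping the command builders separate from `run_pipeline` makes it easy to print and inspect each command before running anything on a few thousand page files.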
Other notes
- It’s higher resolution than the previous scan (6th printing, 2001), though it's from an older printing (4th printing, 1998).
- OCR is better than the previous scan - searching for keywords or phrases usually works
- why not the grayscale PNGs: space constraints on GitHub releases, and dubious value for the space they'd take
- ebooks from the Markdown version are getting closer
- see #137 for some of the thoughts behind this release
Cleaned up Safari ebook
In July 2018, @eshellman worked on the epub version converted from Safari, fixing the cover and some HTML errors, and adding a .mobi version. This version isn't generated from the markdown files, which should be cleaner.
Original sources
These are sources we're working from.
The original scanned PDF:
This is a copy of the original scan in one file. Readers can get one PDF instead of two, and the repo could be a little easier to work with.
An EPUB converted from O'Reilly Safari Books Online:
It's a much cleaner source to work from.
(The tar.gz and zip files are unintentional.)