
Automatically Export Book

Asterios Raptis edited this page Aug 25, 2025 · 21 revisions

Full Export Book Script Documentation

📚 Overview

The full_export_book.py script automates the export of a book into multiple formats (Markdown, PDF, EPUB, DOCX) using Pandoc.

✨ Features

  • 👉 Converts relative image paths to absolute paths before export (optional)
  • 👉 Handles both Markdown images (![alt](path)) and HTML <img> / <figure> tags
  • 👉 Exports book content into multiple formats using Pandoc
  • 👉 Converts absolute paths back to relative paths after export (optional)
  • 👉 Supports custom arguments for flexible execution
  • 👉 Optional EPUB cover image support via the --cover parameter
  • 👉 Auto-detects language from metadata.yaml (or override with --lang)
  • 👉 Poetry integration: run via poetry run full-export

❓ Why Convert Paths to Absolute?

🔍 The Problem

Pandoc does not always resolve relative paths correctly, especially when exporting to:

  • PDF (via LaTeX)
  • EPUB (due to internal resource handling)
  • DOCX (for embedded images)

Example problematic image reference:

![Figure 1](../../assets/figures/diagram.png)

This may result in broken references or missing images.

✅ The Solution

Before export, the script automatically converts all image paths to absolute paths:

![Figure 1](/absolute/path/to/assets/figures/diagram.png)

This ensures:

  • No missing images in PDF, EPUB, and DOCX
  • Platform-independent behavior (Windows, macOS, Linux)
  • Correct image embedding across formats

After export, the script restores relative paths to keep the Markdown clean.
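
The round trip described above can be sketched in Python as follows. This is a minimal illustration, not the code in full_export_book.py: the regex, the function names to_absolute and to_relative, and the base_dir parameter are assumptions, and only Markdown image syntax is handled (no <img> or <figure> tags, no image titles).

```python
import os
import re
from pathlib import Path

# Hypothetical pattern: matches ![alt](target) and captures the target.
MD_IMAGE = re.compile(r"(!\[[^\]]*\]\()([^)]+)(\))")
# Targets such as https://... are left untouched.
URL_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def _rewrite(markdown: str, convert) -> str:
    """Apply convert() to every local image target in the Markdown text."""
    def repl(match: re.Match) -> str:
        prefix, target, suffix = match.groups()
        if URL_SCHEME.match(target):
            return match.group(0)
        return f"{prefix}{convert(target)}{suffix}"
    return MD_IMAGE.sub(repl, markdown)


def to_absolute(markdown: str, base_dir: Path) -> str:
    """Before export: resolve relative targets against the file's directory."""
    def convert(target: str) -> str:
        if Path(target).is_absolute():
            return target
        return Path(os.path.normpath(base_dir / target)).as_posix()
    return _rewrite(markdown, convert)


def to_relative(markdown: str, base_dir: Path) -> str:
    """After export: turn absolute targets back into relative ones."""
    def convert(target: str) -> str:
        if not Path(target).is_absolute():
            return target
        return Path(os.path.relpath(target, base_dir)).as_posix()
    return _rewrite(markdown, convert)
```

Here base_dir stands for the directory of the Markdown file being processed; running to_absolute and then to_relative with the same base_dir should give back the original reference.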


🚀 Installation & Requirements

1️⃣ Install Pandoc

Ensure Pandoc is installed:
🔗 https://pandoc.org/installing.html

pandoc --version

2️⃣ Install Python & Poetry

Ensure Python 3.13+ and Poetry are installed:

python3 --version
poetry --version

If Poetry is missing:

pip install poetry

3️⃣ Install Dependencies

Run:

poetry install

🛠 How to Use

1️⃣ Default Export (All Formats)

poetry run full-export

This will:

  • Convert images to absolute paths
  • Compile the book into Markdown, PDF, EPUB, and DOCX
  • Restore relative paths after export

2️⃣ Export Specific Formats

Specify formats using --format (comma-separated):

Available formats:

  • markdown (GitHub Flavored Markdown)
  • pdf
  • epub
  • docx

Example: Export only PDF and EPUB

poetry run full-export --format pdf,epub
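
For illustration, a comma-separated --format value could be parsed and validated with argparse roughly like this. The helper name parse_formats and the constant SUPPORTED_FORMATS are hypothetical; the actual script may handle the option differently.

```python
import argparse

# The four formats listed above; the constant name is an assumption.
SUPPORTED_FORMATS = ("markdown", "pdf", "epub", "docx")


def parse_formats(value: str) -> list:
    """Split 'pdf,epub' into ['pdf', 'epub'] and reject unknown formats."""
    formats = [item.strip().lower() for item in value.split(",") if item.strip()]
    unknown = [item for item in formats if item not in SUPPORTED_FORMATS]
    if unknown:
        raise argparse.ArgumentTypeError(
            f"unsupported format(s): {', '.join(unknown)}"
        )
    return formats


parser = argparse.ArgumentParser(prog="full-export")
parser.add_argument(
    "--format",
    type=parse_formats,
    default=list(SUPPORTED_FORMATS),  # no flag: export every format
    help="comma-separated list of output formats",
)
```

With this sketch, parser.parse_args(["--format", "pdf,epub"]).format yields the two requested formats, and omitting the flag falls back to all four.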

3️⃣ Add a Cover to EPUB

Use the --cover option to specify a cover image for the EPUB:

poetry run full-export --format=epub --cover=assets/covers/cover.jpg

📌 Notes:

  • Only applies to EPUB export

  • If used without --format=epub, it will be ignored

  • Supported formats: .jpg, .jpeg, .png


4️⃣ Skip Image Processing

If images are already correctly linked, you can skip all image conversion steps:

poetry run full-export --skip-images

🚀 Skips both path rewriting and <img> tag transformations.


5️⃣ Keep Relative Paths (NEW)

If you are using <figure> tags (or otherwise want to preserve relative paths), use:

poetry run full-export --keep-relative-paths

✅ This will:

  • Skip Step 1 (Convert to absolute paths)
  • Skip Step 4 (Restore relative paths)
  • Leave all image/URL references as-is

📌 Useful when your publishing environment already handles relative paths correctly.


6️⃣ Force EPUB 2 Format (Epubli Compatibility)

Some platforms like Epubli and the Tolino network still require EPUB 2 instead of the newer EPUB 3 standard.

To ensure compatibility, use the --epub2 flag:

poetry run full-export --format=epub --cover=assets/covers/cover.jpg --epub2

✅ This will:

  • Instruct Pandoc to export the EPUB in EPUB 2.0 format

  • Avoid common EPUB validation errors like:

    • RSC-005 Invalid metadata

    • OPF-092 Language tag issues (Deutsch (de-DE) → de-DE)

📌 Notes:

  • Only applies to EPUB output

  • Has no effect on PDF, DOCX, or Markdown

  • You can combine it with --cover and --order

Use this option only if your distribution platform explicitly requires EPUB 2.
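
As a rough sketch of what these options amount to on the Pandoc side, the helper below assembles a Pandoc argument list. The function name build_epub_command and its parameters are hypothetical, and the real script may pass further options; the Pandoc options used here (--to epub2, --epub-cover-image, --metadata) are standard Pandoc options.

```python
from typing import List, Optional, Sequence


def build_epub_command(
    sources: Sequence[str],
    output: str,
    cover: Optional[str] = None,
    epub2: bool = False,
    lang: str = "en",
) -> List[str]:
    """Return a Pandoc command line for an EPUB export (nothing is executed)."""
    command = [
        "pandoc",
        *sources,
        "--output", output,
        # "epub" is Pandoc's default EPUB writer (EPUB 3); "epub2" forces EPUB 2.
        "--to", "epub2" if epub2 else "epub",
        "--metadata", f"lang={lang}",
    ]
    if cover:
        command.append(f"--epub-cover-image={cover}")
    return command
```

The resulting list can be handed to subprocess.run(command, check=True) once Pandoc is installed.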

📖 Need more details about EPUB 2?

Check the full guide here:
👉 Export to EPUB 2 – Compatibility Guide

This page explains:

  • Why some platforms still require EPUB 2

  • How to validate your EPUB file

  • Common pitfalls and how to avoid them

  • Tips for using --epub2 effectively with Pandoc


7️⃣ Specify Language Metadata (Optional)

The script auto-detects the language from config/metadata.yaml.
However, you can override it:

poetry run full-export --lang de

🧠 Behavior:

  • If --lang is not provided, the script uses lang: from metadata.yaml

  • If both exist and mismatch, a warning is shown

  • If neither is set, defaults to 'en'

Example in config/metadata.yaml:

title: "My Book"
author: "Author Name"
lang: "en"
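
The precedence rules listed under Behavior can be expressed as a small function. This is a sketch under assumptions: the name resolve_language is hypothetical, and the metadata value is passed in directly instead of being read from config/metadata.yaml.

```python
import logging
from typing import Optional

logger = logging.getLogger("full-export")


def resolve_language(
    cli_lang: Optional[str],
    metadata_lang: Optional[str],
    default: str = "en",
) -> str:
    """Pick the language: CLI value first, then metadata.yaml, then 'en'."""
    if cli_lang and metadata_lang and cli_lang != metadata_lang:
        # Mismatch is only a warning; the CLI argument wins.
        logger.warning(
            "Metadata file says: %r but CLI argument is: %r. Using CLI argument value.",
            metadata_lang,
            cli_lang,
        )
    return cli_lang or metadata_lang or default
```

For example, resolve_language(None, "de") falls back to the metadata value, while resolve_language("en", "de") logs a warning and uses the CLI value.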

📃 Logs

All logs are saved in export.log.

To monitor live:

tail -f export.log

If errors occur, check export.log for debugging.


📂 Project Structure

book-project/
│── manuscript/
│   ├── chapters/
│   │   ├── 01-introduction.md
│   │   ├── 02-chapter.md
│   │   ├── ...
│   ├── front-matter/
│   │   ├── toc.md
│   │   ├── preface.md
│   │   ├── foreword.md
│   │   ├── acknowledgments.md
│   ├── back-matter/
│   │   ├── about-the-author.md
│   │   ├── appendix.md
│   │   ├── bibliography.md
│   │   ├── faq.md
│   │   ├── glossary.md
│   │   ├── index.md
│   ├── figures/
│   │   ├── fig1.png
│   │   ├── fig2.svg
│   │   ├── ...
│   ├── tables/
│   │   ├── table1.csv
│   │   ├── table2.csv
│   │   ├── ...
│   ├── references.bib  # If using citations (e.g., BibTeX, APA, MLA formats supported)
│── assets/ # Images, media, illustrations (for book content, cover design, and figures)
│   ├── covers/
│   │   ├── cover-design.png
│   ├── figures/
│   │   ├── diagrams/
│   │   ├── infographics/
│── config/ # Project configuration (metadata, styling, and optional Pandoc settings)
│   ├── metadata.yaml   # Title, author, ISBN, etc. (used for all formats: PDF, EPUB, MOBI)
│   ├── styles.css      # Custom styles for PDF/eBook
│   ├── template.tex    # LaTeX template (if needed)
│── output/             # Compiled book formats
│   ├── book.pdf
│   ├── book.epub
│   ├── book.mobi
│   ├── book.docx
│── scripts/ # Scripts and tools (initialize project, convert book, update metadata, and export formats)
│   ├── full_export_book.py            # Exports book to all publishing formats with backup
│── create-project-documentation.md           # Documentation for generating the project structure
│── full-export-documentation.md              # Documentation for the export
│── how-to-write.md                           # Documentation on how to use the project structure and save the files
│── LICENSE                                   # If open-source
│── pyproject.toml                            # Configuration file for Poetry
│── README.md                                 # Project description

⚙️ Image Handling Options

The script now supports three different ways of handling images. Use the one that fits your workflow:

| Mode | Steps Executed | Paths After Export | When to Use |
| --- | --- | --- | --- |
| Default (no flags) | ✅ Step 1 (convert to absolute); ✅ Step 4 (restore to relative); ✅ Tag conversion | Restored to relative | Best for Pandoc (ensures images work in PDF/EPUB/DOCX while keeping Markdown clean) |
| --skip-images | ❌ Step 1; ❌ Step 4; ❌ Tag conversion | Whatever is in your Markdown | Fastest option, skips all image handling (use if your Markdown is already clean) |
| --keep-relative-paths | ❌ Step 1; ❌ Step 4; ✅ Tag handling (if relevant) | Preserves relative paths | Best when using <figure> or when your toolchain already supports relative paths |

Note: --skip-images and --keep-relative-paths currently have the same effect on image paths (both skip Steps 1 & 4). They're marked as mutually exclusive to avoid confusion.

flowchart TD
    A([Start]) --> B{Flag?}
    B -->|--skip-images| C[Skip Step 1\nSkip Step 4\nSkip tag conversion]
    B -->|--keep-relative-paths| D[Skip Step 1\nRun tag handling\nSkip Step 4]
    B -->|No flags| E[Run Step 1\nRun tag handling to-absolute]

    C --> F[Step 2: Prepare output folder\nEnsure metadata]
    D --> F
    E --> F

    F --> G[Step 3: Compile\nmarkdown, pdf, epub, docx]
    G --> H{After compile}

    H -->|--skip-images| I[Step 4: Skipped\npaths unchanged]
    H -->|--keep-relative-paths| J[Step 4: Skipped\npaths remain relative]
    H -->|Default| K[Step 4: Restore paths to relative\nTag handling to-relative]

    I --> L[Step 5: Validation\nepub, pdf, docx, md]
    J --> L
    K --> L

Legend

  • Run (green): step is executed

  • Skipped (dark gray): step is not executed

  • Branching is decided in this order:

    1. --skip-images → skip all image-related work

    2. --keep-relative-paths → only skip path rewrites (keep relative paths)

    3. No flags → default behavior (absolute → compile → restore to relative)
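
The branching order above can be sketched as a small decision function plus a mutually exclusive argparse group. The names ImagePlan and plan_image_steps are hypothetical and only mirror the table and flowchart; they are not taken from the script.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class ImagePlan:
    to_absolute: bool       # Step 1: convert paths to absolute
    tag_handling: bool      # <img> / <figure> tag handling
    restore_relative: bool  # Step 4: restore relative paths


def plan_image_steps(skip_images: bool, keep_relative_paths: bool) -> ImagePlan:
    """Decide which image steps run, checking flags in the documented order."""
    if skip_images:
        return ImagePlan(False, False, False)
    if keep_relative_paths:
        return ImagePlan(False, True, False)
    return ImagePlan(True, True, True)


parser = argparse.ArgumentParser(prog="full-export")
# The two flags are mutually exclusive, as noted in the documentation.
group = parser.add_mutually_exclusive_group()
group.add_argument("--skip-images", action="store_true")
group.add_argument("--keep-relative-paths", action="store_true")

args = parser.parse_args(["--keep-relative-paths"])
plan = plan_image_steps(args.skip_images, args.keep_relative_paths)
```

In this example plan.tag_handling is true while both path steps are disabled; passing both flags together makes argparse exit with an error.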


🔑 Quick Reference

  • Use Default if you want maximum compatibility with Pandoc and output formats such as PDF/EPUB/DOCX.
  • Use --skip-images if you want speed and have no broken links.
  • Use --keep-relative-paths if you rely on <figure> tags or know that relative paths will be resolved correctly downstream.

⚠️ Troubleshooting

1️⃣ Pandoc Not Found

If you see:

Command 'pandoc' not found

Install Pandoc:

sudo apt install pandoc  # Ubuntu/Debian
brew install pandoc  # macOS
choco install pandoc  # Windows
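
If you want to detect this situation from Python before starting an export, a check along these lines is one option. The helper name require_tool is hypothetical; shutil.which is part of the standard library.

```python
import shutil


def require_tool(name: str) -> str:
    """Return the full path of an executable on PATH or raise a clear error."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"'{name}' was not found on PATH. Install it and try again."
        )
    return path
```

Calling require_tool("pandoc") at startup turns a missing installation into an explicit error message instead of a failure halfway through the export.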

2️⃣ Cover Not Showing in EPUB

  • Ensure you pass --cover=...

  • Use .jpg or .png

  • Use an EPUB reader like Calibre or Thorium to verify

3️⃣ Images Missing

  • Use the default absolute-path conversion (do not pass --skip-images)

  • Ensure referenced files exist in assets/

🔧 If you're using --keep-relative-paths, make sure your target platform supports relative image references. Pandoc + LaTeX for PDF, for example, may still require absolute paths.

4️⃣ Pandoc Metadata Warning

If you see:

[WARNING] This document format requires a nonempty <title> element.

Ensure config/metadata.yaml exists and sets a title.
If the file is missing, the script will automatically generate a default one.


5️⃣ Language Mismatch Warning

You'll see:

⚠️⚠️⚠️ LANGUAGE MISMATCH DETECTED ⚠️⚠️⚠️
Metadata file says: 'de' but CLI argument is: 'en'
Using CLI argument value.

This is only a warning. The export still runs and uses the CLI value, but you may want to keep the two settings consistent.


🎨 Add a Cover Image to EPUB Output

You can now pass a custom cover image for EPUB output using the --cover argument:

poetry run full-export --format=epub --cover=assets/covers/cover-image.jpg

✅ Requirements:

  • Accepted formats: .jpg, .jpeg, .png

  • Path should be relative to the project root (or absolute)

If you omit the --cover flag, the EPUB will be generated without an embedded cover image.
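
A cover argument that follows these requirements could be checked as sketched below. The function name resolve_cover and the project_root parameter are assumptions for illustration; the sketch only checks the file extension and resolves the path, not whether the file exists.

```python
from pathlib import Path

# Accepted cover formats as listed above.
ALLOWED_COVER_SUFFIXES = {".jpg", ".jpeg", ".png"}


def resolve_cover(cover: str, project_root: Path) -> Path:
    """Validate the cover extension and resolve it against the project root."""
    path = Path(cover)
    if path.suffix.lower() not in ALLOWED_COVER_SUFFIXES:
        raise ValueError(f"unsupported cover format: {path.suffix or '(none)'}")
    # Absolute paths are used as given; relative ones start at the project root.
    return path if path.is_absolute() else project_root / path
```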


⚑ Quick Export Shortcuts

We've moved the shortcut documentation to its own dedicated page:

πŸ‘‰ View the Shortcut Reference β†’


💠 Emoji Replacement for KDP Compliance

To avoid issues with unsupported characters in EPUB/PDF uploads (especially for Kindle), use the emoji cleanup tool:

poetry run replace-emojis

This script:

  • Replaces emojis in Markdown files with safe printable symbols

  • Processes files inside front-matter, chapters, and back-matter

  • Uses scripts/emoji_map.py as its replacement reference

🧩 How to Extend emoji_map.py

You can easily add new emoji mappings in scripts/emoji_map.py.

Example addition:

EMOJI_MAP = {
    # ... existing mappings ...
    "📱": "⌁",  # Smartphone icon → symbol
    "💡": "⚡",  # Lightbulb icon → lightning bolt
}

Just rerun poetry run replace-emojis and the new replacements will apply.
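
To show how such a map is typically applied, here is a minimal sketch. The function replace_emojis below is hypothetical and is not the implementation behind poetry run replace-emojis; the two entries are the example mappings from above.

```python
# Example mappings only; the real scripts/emoji_map.py contains more entries.
EMOJI_MAP = {
    "📱": "⌁",  # Smartphone icon → symbol
    "💡": "⚡",  # Lightbulb icon → lightning bolt
}


def replace_emojis(text: str, mapping: dict = EMOJI_MAP) -> str:
    """Replace every emoji found in the mapping with its printable symbol."""
    for emoji, replacement in mapping.items():
        text = text.replace(emoji, replacement)
    return text


cleaned = replace_emojis("Tip 💡: call support 📱")
```

Emojis that are not in the map are left unchanged, so new mappings only need to be added for characters that cause upload problems.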

🎉 Final Notes

This script helps you create a clean, professional, multi-format export of your book with:

  • 📦 automatic asset handling

  • 🌍 multi-language metadata support

  • 💻 full CLI integration with Poetry

  • ✅ EPUB 2 compatibility for commercial distribution

For emoji compatibility, cover images, and other enhancements, check the Wiki.


🚀 Now ready for use in any book project! 🚀

