EbookRAG is a small Retrieval-Augmented Generation (RAG) toolkit for EPUB libraries. It indexes local ebooks and answers questions about them by combining open-source embeddings with either LangChain or LlamaIndex pipelines.
- Python 3.10 or newer
- uv for environment and dependency management
- Access to an OpenAI-compatible chat model (hosted or self-hosted)
- Pandoc (installed automatically through pypandoc when missing)
uv venv
source .venv/bin/activate
# install core dependencies declared in pyproject.toml
uv sync
Choose one (or both) pipeline extras:
# LangChain toolchain
uv sync --extra langchain
# LlamaIndex toolchain
uv sync --extra llamaindex
If you prefer ad-hoc installation, uv pip install '.[langchain]' or uv pip install '.[llamaindex]' works as well.
- Copy the sample environment file and adjust the values:
cp .env.example .env
- Place your EPUB files inside the directory configured by EPUB_DIR (defaults to epubs/).
- Update the variables in .env as needed:
  - OPENAI_API_KEY, OPENAI_API_BASE_URL, OPENAI_API_MODEL, OPENAI_TEMPERATURE
  - EMBED_PROVIDER and related settings to choose HuggingFace, OpenAI-compatible, or Ollama embeddings
  - Chunking parameters such as CHUNK_SIZE, CHUNK_OVERLAP, CHUNK_SEPARATORS
  - Optional QA_PROMPT_TEMPLATE or QA_PROMPT_TEMPLATE_PATH for custom prompts
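As a rough illustration of how a few of the settings above could be read in Python, the sketch below parses them from the process environment. The `Settings` class, the `load_settings` helper, and every default value except `epubs/` are assumptions made for illustration, not the toolkit's actual code. The `|`-delimited format for `CHUNK_SEPARATORS` is likewise a guess.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical container for a subset of the .env values."""
    epub_dir: str
    embed_provider: str
    chunk_size: int
    chunk_overlap: int
    chunk_separators: list[str]


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    chunk_size = int(env.get("CHUNK_SIZE", "1000"))       # assumed default
    chunk_overlap = int(env.get("CHUNK_OVERLAP", "200"))  # assumed default
    # An overlap as large as the chunk itself would never advance through the text.
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    raw_separators = env.get("CHUNK_SEPARATORS", "")      # assumed "|"-delimited
    separators = [s for s in raw_separators.split("|") if s]
    return Settings(
        epub_dir=env.get("EPUB_DIR", "epubs/"),  # default stated in this README
        embed_provider=env.get("EMBED_PROVIDER", "huggingface").lower(),  # assumed default
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        chunk_separators=separators,
    )
```

Passing a plain dictionary instead of relying on `os.environ` makes the parsing easy to check in isolation, for example `load_settings({"CHUNK_SIZE": "500", "CHUNK_OVERLAP": "50"})`.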
- LangChain pipeline
  python rag_langchain.py
  The script builds (or updates) a FAISS vector store on first run, then drops you into an interactive question prompt. Embeddings are saved under vector_store_langchain/ with a manifest for incremental updates.
- LlamaIndex pipeline
  python rag_llamaindex.py
  This variant maintains its own FAISS index inside vector_store_llamaindex/ and offers the same interactive question loop.
Both scripts log retrieval progress and print answer references so you can trace responses back to source files.
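To make the idea of answer references concrete, here is a minimal sketch of how retrieved chunks might be collapsed into a short source list. The `format_references` function and the shape of its input (dictionaries with a `source` key holding the file path) are hypothetical; the actual scripts may format their references differently.

```python
def format_references(chunks):
    """Return one line per distinct source file, in first-seen order.

    `chunks` is assumed to be a list of dicts with a "source" key (hypothetical shape).
    """
    counts = {}  # dicts preserve insertion order, so first-seen order is kept
    for chunk in chunks:
        source = chunk["source"]
        counts[source] = counts.get(source, 0) + 1
    return [
        f"[{i}] {source} ({n} chunk{'s' if n != 1 else ''})"
        for i, (source, n) in enumerate(counts.items(), start=1)
    ]


if __name__ == "__main__":
    retrieved = [
        {"source": "epubs/a.epub"},
        {"source": "epubs/b.epub"},
        {"source": "epubs/a.epub"},
    ]
    for line in format_references(retrieved):
        print(line)
```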
- The toolkit supports multilingual embeddings out of the box via intfloat/multilingual-e5-base.
- If you switch to Ollama for embeddings, ensure the Ollama server is running and reachable.
- Re-run the chosen script whenever you add or update EPUB files; manifests detect changes and refresh only the affected books.
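Manifest-based change detection of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the toolkit's implementation: the JSON manifest layout (file name mapped to a SHA-256 digest) and the helper names `file_digest`, `find_changed_books`, and `save_manifest` are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash a file in 1 MiB blocks so large EPUBs are not loaded at once."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(block)
    return hasher.hexdigest()


def find_changed_books(epub_dir: Path, manifest_path: Path):
    """Compare the EPUB directory with the stored manifest.

    Returns (changed, removed, new_manifest): `changed` lists new or modified
    files, `removed` lists files that are in the old manifest but gone from disk.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {p.name: file_digest(p) for p in sorted(epub_dir.glob("*.epub"))}
    changed = [name for name, digest in new.items() if old.get(name) != digest]
    removed = [name for name in old if name not in new]
    return changed, removed, new


def save_manifest(manifest_path: Path, manifest: dict) -> None:
    """Persist the manifest once the affected books have been re-indexed."""
    manifest_path.write_text(json.dumps(manifest, indent=2))
```

Hashing file contents rather than comparing modification times avoids re-embedding a book that was merely copied or touched, at the cost of reading every file on each run.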