EbookRAG

EbookRAG is a small Retrieval-Augmented Generation (RAG) toolkit for EPUB libraries. It indexes local ebooks and answers questions about them by combining open-source embeddings with either LangChain or LlamaIndex pipelines.

Requirements

  • Python 3.10 or newer
  • uv for environment and dependency management
  • Access to an OpenAI-compatible chat model (hosted or self-hosted)
  • Pandoc (installed automatically through pypandoc when missing)

Setup with uv

uv venv
source .venv/bin/activate
# install core dependencies declared in pyproject.toml
uv sync

Then install one (or both) of the pipeline extras:

# LangChain toolchain
uv sync --extra langchain

# LlamaIndex toolchain
uv sync --extra llamaindex

If you prefer ad-hoc installation, uv pip install '.[langchain]' or uv pip install '.[llamaindex]' works as well.

Configuration

  1. Copy the sample environment file and adjust the values:
    cp .env.example .env
  2. Place your EPUB files inside the directory configured by EPUB_DIR (defaults to epubs/).
  3. Update the variables in .env as needed:
    • OPENAI_API_KEY, OPENAI_API_BASE_URL, OPENAI_API_MODEL, OPENAI_TEMPERATURE
    • EMBED_PROVIDER and related settings to choose HuggingFace, OpenAI-compatible, or Ollama embeddings
    • Chunking parameters such as CHUNK_SIZE, CHUNK_OVERLAP, CHUNK_SEPARATORS
    • Optional QA_PROMPT_TEMPLATE or QA_PROMPT_TEMPLATE_PATH for custom prompts
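
For reference, a filled-in .env might look roughly like the following. Every value here is an illustrative placeholder; .env.example remains the authoritative list of settings and accepted values:

OPENAI_API_KEY=sk-...
OPENAI_API_BASE_URL=https://api.openai.com/v1
OPENAI_API_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.2
EPUB_DIR=epubs/
EMBED_PROVIDER=huggingface
CHUNK_SIZE=1000
CHUNK_OVERLAP=200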

Usage

  • LangChain pipeline

    python rag_langchain.py

    On the first run the script builds a FAISS vector store (subsequent runs update it), then drops you into an interactive question prompt. Embeddings are saved under vector_store_langchain/ along with a manifest that enables incremental updates.
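
    As a rough illustration, the query-time flow this script wraps looks something like the sketch below. It assumes the langchain extra provides langchain-community and langchain-huggingface; the actual script additionally handles EPUB loading, manifest tracking, and the chat model.

    from langchain_community.vectorstores import FAISS
    from langchain_huggingface import HuggingFaceEmbeddings

    # Same multilingual model the toolkit uses by default.
    embeddings = HuggingFaceEmbeddings(model_name="intfloat/multilingual-e5-base")

    # Reload the store that a previous run saved under vector_store_langchain/.
    store = FAISS.load_local(
        "vector_store_langchain",
        embeddings,
        allow_dangerous_deserialization=True,
    )

    # Fetch the passages most relevant to a question; the script passes these
    # to the chat model and prints source references alongside the answer.
    retriever = store.as_retriever(search_kwargs={"k": 4})
    for doc in retriever.invoke("Who narrates the opening chapter?"):
        print(doc.metadata.get("source"), doc.page_content[:120])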

  • LlamaIndex pipeline

    python rag_llamaindex.py

    This variant maintains its own FAISS index inside vector_store_llamaindex/ and offers the same interactive question loop.
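
    By analogy, a minimal sketch of reopening the persisted LlamaIndex store, assuming the llamaindex extra provides llama-index-vector-stores-faiss; again, the real script adds EPUB ingestion, manifests, and logging on top of this.

    from llama_index.core import StorageContext, load_index_from_storage
    from llama_index.vector_stores.faiss import FaissVectorStore

    # (The actual script configures the embedding and chat models from .env first.)
    # Reopen the FAISS-backed index persisted under vector_store_llamaindex/.
    vector_store = FaissVectorStore.from_persist_dir("vector_store_llamaindex")
    storage_context = StorageContext.from_defaults(
        vector_store=vector_store,
        persist_dir="vector_store_llamaindex",
    )
    index = load_index_from_storage(storage_context)

    # Ask a question; the response object carries source nodes for references.
    response = index.as_query_engine().query("Who narrates the opening chapter?")
    print(response)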

Both scripts log retrieval progress and print answer references so you can trace responses back to source files.

Tips

  • The toolkit supports multilingual embeddings out of the box via intfloat/multilingual-e5-base.
  • If you switch to Ollama for embeddings, ensure the Ollama server is running and reachable (see the quick check after this list).
  • Re-run the chosen script whenever you add or update EPUB files; manifests detect changes and refresh only the affected books.
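
As a quick reachability check, assuming Ollama is listening on its default local address, you can list the models the server exposes:

curl http://localhost:11434/api/tags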
