Your invention. Your machine. Your patent. Zero data leaks.
Stop uploading your billion-dollar idea to ChatGPT's servers. Stop waiting weeks for a first draft.
mPAPA is a fully local AI patent drafting system. It runs on YOUR machine*. Your invention data never touches the internet. Ever.
*Any recent PC will do. Just 6 GB+ of (V)RAM for the LLM. A GPU is optional but strongly recommended. Works great with models like Gemma-4-E2B.
🔍 Searches 5 patent/paper databases simultaneously – EPO, Google Patents, Google Scholar, ArXiv, PubMed. 100+ references collected and analyzed automatically.
💬 AI Chat over your entire research – Ask questions, explore prior art, refine your claims. RAG-powered, context-aware, backed by every reference you've collected.
⚡ 9-step AI workflow generates your complete patent draft:
- Invention disclosure review
- Claims drafting (European patent format)
- Prior art landscape analysis
- Novelty assessment
- Consistency review
- Market potential analysis
- Legal & IP clarification
- Disclosure summary
- Full patent specification
Every step: AI generates → you review → you edit → you continue. Full control. Full transparency.
📄 Export to DOCX – Professional formatting, proper styles, header/footer, ready for your attorney.
🎭 Agent Personality Modes – Not every step needs the same attitude. Switch between Critical (skeptical, no sugarcoating), Neutral (balanced, just the facts), and Innovation-Friendly (opportunity-focused, constructive). Novelty analysis tears your claims apart by default. Disclosure keeps it open-minded. You set the tone per agent, per topic, because a patent draft needs a devil's advocate, not a cheerleader.
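Conceptually, per-agent tone selection is a mapping from agent to system-prompt flavor. Here is a hypothetical sketch (the agent names, mode keys, and prompt wording below are illustrative assumptions, not mPAPA's actual configuration):

```python
# Hypothetical sketch of per-agent personality modes – the real option
# names and prompts in mPAPA may differ.
MODES = {
    "critical": "Be skeptical. Challenge every claim, no sugarcoating.",
    "neutral": "Stay balanced. Just the facts.",
    "innovation_friendly": "Focus on opportunities. Be constructive.",
}

# Defaults mirroring the behavior described above (assumed agent names)
AGENT_MODE = {
    "novelty_assessment": "critical",
    "disclosure_review": "innovation_friendly",
}

def system_prompt(agent: str) -> str:
    """Build the tone part of an agent's system prompt."""
    mode = AGENT_MODE.get(agent, "neutral")  # everything else runs neutral
    return f"You are the {agent} agent. {MODES[mode]}"

print(system_prompt("novelty_assessment"))
```

Keeping the mode per agent (rather than one global setting) is what lets the novelty step stay adversarial while the disclosure step stays open-minded.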
Everything persisted locally in SQLite – per topic, per step, per setting. Close the browser, shut down your machine, come back in a week. Your entire workflow, research, drafts, and agent configurations are exactly where you left them.
| | mPAPA | ChatGPT / Claude | Patent Attorney |
|---|---|---|---|
| Data privacy | 100% local | Your IP on their servers | NDA required |
| Cost | Free / Open Source | $20-200/month | $5,000-50,000 |
| Speed | 30 minutes | Hours of prompting | 2-8 weeks |
| Prior art search | 5 sources, automated | Manual, one at a time | Manual, expensive |
| Structured workflow | 9-step guided process | Unstructured chat | Black box |
| Editable at every step | Yes | Start over | Revision rounds |
| Works offline | Yes | No | No |
9 specialized AI agents work together in a coordinated pipeline – each one focused on a single task (disclosure analysis, claims drafting, novelty assessment, consistency review, …). They share context through a structured state graph, so every agent builds on what the previous ones discovered.
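The shared-state idea can be pictured as follows – a plain-Python sketch, not the actual LangGraph code (mPAPA uses LangGraph's state graph with a human review pause between steps; the function names and state keys here are illustrative):

```python
# Illustrative shared-state pipeline: each agent reads the accumulated
# state and writes its own result into it for the next agent to use.

def disclosure_review(state: dict) -> dict:
    state["disclosure"] = f"Reviewed: {state['idea']}"
    return state

def claims_drafting(state: dict) -> dict:
    # Builds on what the previous agent produced
    state["claims"] = f"Claim 1 based on {state['disclosure']}"
    return state

def novelty_assessment(state: dict) -> dict:
    state["novelty"] = f"Assessed novelty of: {state['claims']}"
    return state

PIPELINE = [disclosure_review, claims_drafting, novelty_assessment]

state = {"idea": "self-sealing fuel valve"}
for step in PIPELINE:
    state = step(state)  # in mPAPA: pause here for human review/edit

print(state["novelty"])
```

Because later agents only see the shared state, editing an intermediate result (e.g. the claims) and re-running from that point automatically propagates your changes downstream.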
- LM Studio – Run any open-source LLM locally. Llama, Qwen, Mistral. Your choice.
- LangGraph – Multi-step AI workflows with human-in-the-loop review at every stage.
- DSPy – Structured, optimizable prompts. Not fragile prompt strings.
- LlamaIndex – RAG over your entire reference collection. Local embeddings.
- NiceGUI – Clean web interface. No Electron bloat. No cloud dependency.
- SQLite – Everything persisted. Close the app, reopen tomorrow, pick up where you left off.
- Independent inventors who can't afford $15K for a patent application
- Startup founders who need to file fast before competitors
- R&D engineers who want a solid first draft before engaging counsel
- Patent professionals who want AI assistance without data privacy risks
- Academic researchers exploring patentability of their work
Why we don't provide an .exe file:
We've decided to temporarily discontinue providing Windows executables. Here's why:
- False virus warnings: Windows Defender and other antivirus software flag Python-packaged executables as malicious, even though they're completely safe. This creates unnecessary concern and a poor user experience.
- Build reliability issues: The automated exe build process is unstable, takes over 2 hours, and occasionally produces inconsistent results.
- Trust and transparency: We believe you shouldn't have to blindly trust an executable file. Running from source lets you see exactly what you're running.
Good news: With modern Python tooling, running from source is nearly as simple as double-clicking an exe! See Option B below.
The reliable, transparent, and surprisingly easy way to run mPAPA.
- Install `uv` (a fast Python package manager) – takes 30 seconds → uv getting started
- Download the source code or clone with git: `git clone https://github.com/OWNER/REPO.git`
- Copy `.env.example` to `.env` and adjust settings if needed
- Run `uv run mpapa` (the first time it downloads all needed libraries automatically)
- Open `http://localhost:8080` in your browser
Why this is actually better than an exe:
- ✅ No antivirus warnings or security concerns
- ✅ Smaller download size
- ✅ Always up-to-date dependencies
- ✅ Full transparency – inspect any part of the code
- ✅ Easier to update and modify
On first launch, mPAPA automatically creates data/ and logs/ folders.
mPAPA needs a local LLM running via LM Studio. Here's how to get it going:
- Download from lmstudio.ai (Windows, macOS, Linux)
- Run the installer – no special configuration needed
Open LM Studio and search for a model in the Discover tab. Recommended options for small machines:
| Model | Size | Notes |
|---|---|---|
| Gemma 3 4B | ~3 GB | Fast, good quality, runs on most machines |
| Qwen 2.5 7B | ~5 GB | Strong reasoning, good for patent language |
| Llama 3.1 8B | ~5 GB | Well-rounded, widely tested |
| Mistral 7B | ~4 GB | Good balance of speed and quality |
Pick one that fits your RAM. 8 GB of system RAM is enough for 4B models; 16 GB for 7-8B models. A GPU isn't required, but it is strongly recommended for speed.
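As a back-of-envelope check, a quantized model's footprint is roughly parameters × bits per weight ÷ 8, plus some overhead for context and caches. This is a rough rule of thumb for sizing, not an exact formula:

```python
def approx_model_gb(params_billions: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Rough size estimate for a quantized LLM (illustrative rule of thumb).

    bits=4 assumes a common 4-bit quantization; overhead=1.2 is an
    assumed ~20% allowance for KV cache and runtime buffers.
    """
    return params_billions * bits / 8 * overhead

print(round(approx_model_gb(4), 1))  # ~2.4 GB – consistent with the ~3 GB listed for 4B models
print(round(approx_model_gb(7), 1))  # ~4.2 GB – consistent with the ~4-5 GB listed for 7B models
```

The estimates line up with the download sizes in the table above, which is why 8 GB of system RAM comfortably fits a 4B model but a 7-8B model wants 16 GB.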
In LM Studio's Discover tab, also search for and download an embedding model:
- nomic-embed-text-v1.5 (~270 MB) or any other "embed" model
- Go to the Developer tab in LM Studio
- Load your chosen chat model
- Load the embedding model
- Click Start Server
- The server runs at `http://localhost:1234/v1` by default – this matches mPAPA's default config
That's it. Leave LM Studio running and start mPAPA.
Tip: If you change the LM Studio port or URL, update `PATENT_LM_STUDIO_BASE_URL` in your `.env` file.
- Create a new topic in the left sidebar
- Insert your patent idea and some search terms
- Add your local reference resources/papers/patents
- Start the search – reference patents and other literature will be downloaded and a RAG database is built
- Chat with the AI – ask for better keywords and for first improvements to your patent idea, then refine and repeat from step 2, all based on the found references and your input
- Click "Generate Patent Draft" to start the agent workflow
- The system walks through disclosure, search, analysis, drafting, and review
- Edit claims and description in the expandable editors
- Change and re-run as you like
- Export to DOCX when ready
You can take a break whenever you want – all results are saved to a local database and restored when you select a topic.
Tip: You can easily change the appearance of the export by customising the template in ./src/export/templates.
All settings are managed via environment variables with the PATENT_ prefix, or a .env file in the project root. See .env.example for all available options.
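For example, a minimal `.env` could look like this. Only `PATENT_LM_STUDIO_BASE_URL` is documented in this README; check `.env.example` for the actual list of variable names:

```
# .env – copied from .env.example; all variables use the PATENT_ prefix
PATENT_LM_STUDIO_BASE_URL=http://localhost:1234/v1
```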
uv run pytest -q

src/patent_system/
├── main.py              # App entry point
├── config.py            # Pydantic Settings
├── logging_config.py    # Structured JSON logging
├── exceptions.py        # Custom exception hierarchy
├── db/                  # SQLite schema, models, repositories
├── agents/              # LangGraph workflow and agent nodes
├── dspy_modules/        # DSPy signatures and module wrappers
├── rag/                 # Embedding service, RAG engine, citation graph
├── parsers/             # Source-specific parsers (DEPATISnet, ArXiv, etc.)
├── export/              # DOCX exporter with template support
├── monitoring/          # Background prior art monitoring scheduler
└── gui/                 # NiceGUI web interface panels
Getting crashes or errors when the agent tries to analyze your collected references? Your LLM's context window is probably too small.
Fix: In LM Studio's Developer tab → Model settings → crank Context Length to the maximum (usually 32768, 40960, or even 262144 and more, depending on your model). Prior art analysis needs to juggle multiple patent abstracts at once – give it room to breathe.
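A quick back-of-envelope shows why a default-sized window falls over here (the counts below are illustrative assumptions, not measured from mPAPA):

```python
# Illustrative token budget for one prior-art analysis step
abstracts = 20             # patent abstracts in a single prompt (assumption)
tokens_per_abstract = 400  # typical abstract length in tokens (assumption)
overhead = 2000            # instructions, workflow state, prior agent output (assumption)

needed = abstracts * tokens_per_abstract + overhead
print(needed)  # 10000 – already past a 4096 or 8192 token default window
```

And that budget still has to leave room for the model's own answer, so erring toward the largest context your RAM allows is the safe move.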
If the AI agent seems frozen in eternal contemplation (spinning wheel of death, no output for minutes), you're probably running on CPU only.
Fix: In LM Studio's Developer tab → Model settings → check GPU Offload. If it's set to 0, you're not using your GPU at all. Bump it up – start with 50% of your model's layers, adjust from there. No GPU? Consider a smaller model (4B instead of 7B) or accept that patent drafting just became your new meditation practice.
Pro tip: Watch LM Studio's performance metrics while mPAPA runs. If you see 100% CPU and 0% GPU usage, that's your problem right there.
See LICENSE.
mPAPA – because your invention is too valuable to upload to someone else's server.
Built by koehler. Open source. Local first. Always.