EduTutor.AI

Slovak AI language tutor with voice, a RAG knowledge base, a 4-mode Knowledge Base platform (Chat / Study / Voice / Ask), podcast generation, OmniVoice voice cloning (600+ languages incl. Slovak), cross-session memory, and a 3D MetaHuman avatar with a proprietary Slovak lip-sync engine plus a 52-channel ARKit mode.

Grant project · SORRYWECAN s.r.o. · 09I05-03-V04-00072 · v5.0 (May 2026)

📦 Install on Windows · one-click `.exe`

The whole stack — backend, frontend, Pixel Streaming signalling, cooked UE5 avatar — in a single double-click. No git clone, no Python, no Node, no Docker.

1 · Download

v1.0 is LIVE — download the latest installer: https://github.com/sorrywecann/edututor-ai/releases/latest

The EduTutor-Setup-X.exe installer (~600 MB) ships with every release. It is not code-signed yet, so expect a SmartScreen prompt during install. For the dev setup (clone + run from source), see docs/guides/FULL_STACK_SETUP.md.

The installer is much smaller than previous releases because the 3D MetaHuman avatar engine (~1.5 GB) is now shipped as a separate asset (ue5-engine-0.4.4.zip) and fetched on first launch. The all-in-one installer outgrew both the NSIS 2 GB mmap cliff and GitHub's 2 GiB asset cap, so v0.4.4 ships as two assets — see First launch below.

2 · Install

Double-click the .exe. Windows SmartScreen will warn (unsigned, we don't ship a code-signing cert yet) → click More info → Run anyway. Per-user install; you choose the folder.

Then launch EduTutor.AI from Desktop or Start Menu.

3 · First launch — internet required (one-time)

The first launch needs an internet connection for two one-time downloads:

Download	Size	Where it lands	When
UE5 avatar engine (`ue5-engine-0.4.4.zip`)	~1.5 GB	`%APPDATA%\edututor-desktop\ue5\`	Splash shows "Sťahujem avatar engine (1.5 GB, jednorazovo)" ("Downloading avatar engine (1.5 GB, one-time)") with a progress bar; resumes via HTTP range requests (`Range` / `Content-Range`) if interrupted, SHA-256 verified, then extracted.
Local LLM (Ollama `gemma3:4b`)	~3 GB	`%APPDATA%\edututor-desktop\ollama-models\`	First-run setup pulls it via the existing progress UI. Skippable if you paste a cloud API key (OpenAI / Anthropic / Groq) instead.

After both downloads complete, the app is fully local — no further internet needed for chat, voice input (faster-whisper bundled), avatar or RAG. The only runtime cloud dependency is Edge TTS (Microsoft's free Slovak voice, no key); a bundled Piper for offline TTS is on the v0.5 roadmap.

The app stores its version in .bundle-version inside the UE5 dir, so the engine is re-downloaded only when it actually changes between releases.
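The first-launch behaviour described above (resume after an interruption, SHA-256 verification, re-download only on a version change) can be sketched in a few lines. This is a minimal illustration, not the launcher's actual code: the function names, the partial-file name, and the assumption that `.bundle-version` holds a plain version string are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def resume_headers(partial: Path) -> Dict[str, str]:
    """Request headers that ask the server to continue a partial download."""
    offset = partial.stat().st_size if partial.exists() else 0
    # A Range request asks for the bytes from `offset` onward; a server that
    # supports resuming replies 206 with a matching Content-Range header.
    return {"Range": f"bytes={offset}-"} if offset else {}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a 1.5 GB archive never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(engine_dir: Path, release_version: str) -> bool:
    """True when the marker file is missing or names a different version.

    Assumption: .bundle-version contains just the version string.
    """
    marker = engine_dir / ".bundle-version"
    if not marker.exists():
        return True
    return marker.read_text(encoding="utf-8").strip() != release_version
```

A caller would compare `sha256_of(archive)` against the checksum published with the release before extracting, and write the marker file only after extraction succeeds.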

4 · After the downloads (~15–30 s per launch)

A cinematic splash opens — warm amber breathing orb, English words cycling (Waking up · Breathing · Studying · Thinking · Listening · Vibing · Flowing · Jamming), tiny status line showing each service coming up. When the stack is ready the splash cross-fades into the app, straight into the Chamber onboarding:

Step
Vitaj	Welcome — particle-constellation orb, "Tichá miestnosť na učenie."
Krok 1 · Meno	Tells the tutor what to call you
Krok 2 · Charakter	Pick a tone — Vážny / Praktický / Pohodový
Krok 3 · Pripravený	Confirmation → Vstúpiť do miestnosti →

Then the conversation page opens — click the orb to start a voice session. When the tutor speaks the constellation pulsates live with the audio.

What you'll need

	Required
OS	Windows 10 (1803+) or Windows 11, x64
Disk	~5 GB free (installer ~600 MB + ~1.5 GB UE5 engine fetched on first launch + ~3 GB local LLM + user data)
LLM	Bundled — Ollama ships in the `.exe`. First launch pulls `gemma3:4b` (~3 GB, ~5–10 min). Optionally paste a cloud API key (OpenAI / Anthropic / Groq) in the first-run screen for higher-quality answers + to skip the local pull.
STT (voice input)	Bundled — `faster-whisper` in the lean Python. First mic click downloads a ~145 MB whisper-base model once; afterwards instant + offline. No cloud key required.
TTS (voice output)	Edge TTS — Microsoft's free cloud Slovak voice. No key, but needs internet at runtime. (Future v0.5.0 will bundle Piper for offline TTS.)
GPU	Any modern GPU for the 3D MetaHuman avatar. CPU-only still shows the orb.
Mic	For voice — Windows prompts for permission on first use
Internet	For Edge TTS (cloud, free) and any cloud LLM. Fully offline = Ollama + local model only.

Switching the design

The default is the Chamber (pure black + particle orb). To flip to the warmer atmosphere variant, click the A/C pill at the bottom of the sidebar. One click swaps the whole app, and the choice persists across launches.

If something doesn't work

Logs land in %APPDATA%\edututor-desktop\logs\:

File	What's in it
`launcher.log`	orchestrator events (service start / stop / restart)
`backend.log`	FastAPI + uvicorn (model load errors, port conflicts)
`frontend.log`	Next dev server output
`ue5.log`	UE5 game logs

Common gotchas:

Splash stuck on "thinking…" → Ollama isn't running and no cloud key is set yet. Either start Ollama and pull a local model (the first-run default is gemma3:4b), or click through to the first-run screen and paste an API key.
"Internal server error" in the window → :8000 or :30000 held by an orphan from a previous session. Close Electron, end stray python.exe / SlovakEdu.exe in Task Manager, relaunch.
Avatar blank / black → UE5 needs DirectX 12 + recent GPU driver. The orb-only fallback still works.
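Before hunting through Task Manager for the "Internal server error" case, it can help to confirm which port is actually held. The sketch below is a generic TCP check using only the Python standard library; it is not part of the app, and the helper names are hypothetical. The default ports are the two named in the gotcha above.

```python
import socket
from typing import Iterable, List


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: Iterable[int] = (8000, 30000)) -> List[int]:
    """List the ports from `ports` that already have a listener."""
    return [p for p in ports if port_in_use(p)]


if __name__ == "__main__":
    held = busy_ports()
    print("ports already in use:", held if held else "none")
```

If a port shows up as busy while the app is closed, an orphaned process from a previous session is the likely owner.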

What's inside (latest features)

Atmospheric 2026 UI — dark glass design system (radial hero gradient, glass surfaces, micro-labels), avatar-first chat shell with collapsible right-side conversation drawer, ElevenLabs-style Voice Lab (tabs: Generate / My voices / Create), NotebookLM-style 3-column Knowledge workspace (Sources / Chat / Studio rail with 8 study actions)
OmniVoice voice cloning — 600+ languages incl. Slovak, ~1.2 GB model, lazy-loaded
Knowledge Base — NotebookLM-style 3-column layout with four modes: Chat (conversational Q&A) · Study (review flashcards + study notes) · Voice (hands-free) · Ask (one-shot deep query, 15 sources); Studio rail with Zhrnutie (Summary), Kľúčové body (Key points), Kartičky (Flashcards), Otázky (Questions), Jednoducho (Simplified), Analýza (Analysis), Akčné body (Action items), Podcast
Podcast generation — multi-speaker audio podcasts from KB documents (FFmpeg concat)
Cross-session memory (Phase 8b) — user_profile + episodic memory + auto-summarizer
3D MetaHuman avatar — 14 Slovak visemes + 52-channel ARKit blendshapes + 9 emotions + text2face presets
Conversational viseme timing — env-tunable 60/100/45 ms phoneme durations
WebSocket reconnect — exponential backoff + connection state UX
4 UE5 transport adapters — Web Browser Widget · Pixel Streaming · WS Server · Mock
7 LLM providers · 5 STT backends · 7 TTS providers (+3 via explicit dispatch) — all runtime-switchable
One-bundle Windows installer (EduTutor-Setup-X.exe, ~600 MB, plus the ~1.5 GB UE5 avatar engine fetched on first launch) — backend (lean self-contained CPython 3.11) + frontend (Next.js standalone) + Pixel Streaming signalling + cooked UE5 avatar, started with a single double-click; per-user writable data dir, no Python/Node/Docker/repo required on the target. See desktop/BUNDLE.md.
Two design variants — toggle in the sidebar — chamber (default, pure-black private-mentor chamber with particle-constellation orb and Geist + Instrument Serif italic accents) and atmosphere (warm "Living Room"). A/C pill at the bottom of the sidebar flips between them.
595+ backend tests (62 files) · k6 load testing · 354-question golden dataset

For developers


Run from source with UE5 avatar (one command, auto-downloads UE5 + Wilbur)	`.\start.ps1 -Avatar` — see `docs/guides/START_STACK.md`
Run from source, team member with sibling UE5 clone	`.\start.ps1 -Avatar -UseSiblingClone`
Build the installer from source (cook UE5 + stage resources + electron-builder)	`desktop/BUNDLE.md`
Just the web stack on Windows (no UE5 avatar)	`start.bat` (or `start.ps1`)

Recommended start path

Three supported start paths, in order of recommendation:

Quick start (most users) — .\start.ps1 (Windows) or ./start.sh (Mac/Linux). One command, runs from source. See docs/guides/START_STACK.md.
Docker (optional production deploy) — docker compose up. Use this for server deploys or fully reproducible local runs.
Packaged installer (.exe) — download from releases/latest. One double-click, no clone, no Python/Node/Docker required.

The three sections below document each path in detail.

Quick Start — click and run (Docker)

Prerequisites: Docker Desktop (Mac / Windows / Linux). Nothing else.

Download or clone this repository.
Open the folder in Finder (Mac) or Explorer (Windows).
Double-click the launcher for your OS:

OS	File	Stop
macOS	`./start.sh`	Ctrl+C in the terminal
Windows	`start.bat` (or `start.ps1`)	Ctrl+C in the terminal
Linux	`./start.sh` in a terminal	Ctrl+C in the terminal

The launcher checks for Docker, creates .env from .env.example, builds the containers, and opens the browser at http://localhost:3000 on its own. The first run can take 5–10 minutes (it downloads ~2 GB of models and images). Later runs start in seconds.

Quick Start — manual (no Docker, for development)

Requires: Python 3.11+, Node.js 20+, pnpm (or npm)

# 1. Clone
git clone https://github.com/sorrywecann/edututor-ai.git
cd edututor-ai

# 2. Start everything
./scripts/start.sh   # Mac / Linux (dev mode, no Docker)

Manual frontend install? Run pnpm install from core/ (a project-local .npmrc already sets legacy-peer-deps=true — required for the next@15 + next-auth@4 combination). With plain npm, use npm install --legacy-peer-deps.

Open http://localhost:3000

Demo login (pre-filled — just click Sign in):

Field	Value
Email	`demo@edututor.sk`
Password	`edututor2026`

Override the password via DEMO_PASSWORD=… in core/.env if needed. Real authentication (magic-link / OAuth) is Phase 9 — out of scope for the grant prototype.

The Hardware Setup modal fires automatically on first load — it detects your machine, shows live status for every service, and applies the optimal STT / LLM / TTS config in one click. No .env editing required.

No API key needed if you have Ollama installed — it's detected automatically (no .env editing required). Pull a model based on your RAM:

ollama pull gemma3:4b      # 2.5 GB — 8GB+ RAM, decent Slovak
ollama pull qwen2.5:7b     # 4.7 GB — 12GB+ RAM, much better Slovak ← recommended
ollama pull gemma3:12b     # 8.1 GB — 16GB+ RAM, best local Slovak
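The RAM guidance in the three commands above amounts to a small lookup. The sketch below restates it in Python for anyone scripting a setup; the thresholds and tags come from the list above, while the function name and the "largest model that fits" rule are assumptions (the README itself recommends `qwen2.5:7b` as the general pick).

```python
from typing import Optional

# (minimum RAM in GB, Ollama model tag), largest first.
MODEL_BY_RAM = [
    (16, "gemma3:12b"),
    (12, "qwen2.5:7b"),
    (8, "gemma3:4b"),
]


def suggest_model(ram_gb: float) -> Optional[str]:
    """Return the largest listed model whose RAM floor fits, else None."""
    for min_ram, tag in MODEL_BY_RAM:
        if ram_gb >= min_ram:
            return tag
    return None  # below 8 GB: none of the listed models is suggested


if __name__ == "__main__":
    for ram in (4, 8, 12, 32):
        print(ram, "GB ->", suggest_model(ram))
```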

Quick Start — Docker

Requires: Docker Desktop

cp .env.example .env
# Edit .env → add at least OPENAI_API_KEY or ANTHROPIC_API_KEY

docker compose up --build

Open http://localhost:3000

The first run downloads ~500 MB of models. The ~1.2 GB OmniVoice model is lazy-loaded, so it downloads only on first use. Subsequent starts are fast.

What works without an API key

Component	Default	Notes
STT	mlx-whisper-turbo (Apple Silicon) / faster-whisper (others)	Fully local, no API key
TTS	Edge TTS — sk-SK-LukasNeural	Free Microsoft cloud, no key
RAG	ChromaDB embedded	No Docker, no separate service
Database	SQLite	No PostgreSQL needed for dev
LLM	Ollama (local, free) or any cloud key	Hardware modal auto-detects

Free fully-offline option (Ollama)

# Mac
brew install ollama && ollama serve

# Windows / Linux — download installer at https://ollama.com/download

Pull a model (Ollama is auto-detected at startup — no .env needed):

ollama pull qwen2.5:7b   # best Slovak quality that fits in 12GB RAM
ollama pull gemma3:4b    # fallback for 8GB machines (2.5 GB)

The app detects Ollama automatically on startup and picks the best model you have installed.

LLM options

Provider	Key needed	Latency	Cost	Slovak quality	Best for
OpenAI gpt-4o-mini	Yes	~4s	~$0.01/session	★★★★★	Easiest start
Anthropic Claude Haiku	Yes	~3s	~$0.01/session	★★★★★	Alternative cloud
Groq llama-3.3-70b	Yes (free tier)	~0.5s	Free	★★★★☆	Fastest free cloud
Ollama gemma3:4b	No	~3s	Free	★★★☆☆	8GB RAM, offline
Ollama qwen2.5:7b	No	~4s	Free	★★★★☆	12GB RAM — best free mid-range
Ollama gemma3:12b	No	~10s	Free	★★★★☆	16GB RAM, best local
vLLM + Qwen2.5-32B	No	~0.5s	Free	★★★★★	RTX 4090 / power tier

Hardware-adaptive setup

EduTutor detects your hardware on first login and recommends the optimal config:

Tier	RAM / GPU	STT	LLM	TTS
Minimal	<10 GB	faster-whisper-small	openai	edge
Standard	16 GB / Apple Silicon	mlx-whisper-turbo	openai	edge
Performance	32–64 GB / M3 Max	mlx-whisper-large-v3	ollama/gemma3:27b	piper
Power	RTX 4090 24GB	faster-whisper-large (CUDA)	vLLM/Qwen2.5-32B	edge
Server	A100 / H100	faster-whisper-large (CUDA)	vLLM/Llama-3.3-70B	edge

Click Apply in the modal — all three services switch instantly, no restart.

Development

edututor-ai-sandbox/
├── core/                  Next.js 15 frontend
│   └── src/
│       ├── app/           Pages
│       ├── components/    UI components (shell, voice, chat)
│       └── hooks/         useVoiceSession, useProviderSettings
├── tutor-service/         FastAPI backend
│   └── app/
│       ├── api/           Endpoints (chat, stt, tts, llm, knowledge_bases, system)
│       ├── services/      STT, TTS, LLM, RAG, memory services
│       └── config/        LLM system prompt, RAG config
├── docker-compose.yml     Production-ready Docker stack
├── scripts/start.sh       Local dev one-command start
└── .env.example           All config options documented

Backend API docs: http://localhost:8000/docs (Swagger UI, auto-generated)

Architecture map (high-leverage entry points for contributors)

File	What it is
`tutor-service/app/api/chat.py`	The hot path. Streaming chat, tool-call loop, UE5 broadcast, Depends-injected LLM. Read this first.
`tutor-service/app/skills/`	Skill ABC + `SkillRegistry` — the modular agent platform. Drop a `Skill` subclass here and it auto-registers.
`tutor-service/app/deps.py`	FastAPI Depends providers for service injection. Test with `app.dependency_overrides`.
`tutor-service/app/services/avatar_broadcaster.py`	Snapshot-safe WebSocket fan-out to UE5 clients. v2.1 protocol.
`tutor-service/app/config/learning_modes.py`	Persona system. `enabled_skills` + `agent_type` wire skills to modes.
`core/src/lib/config.ts`	Single source of truth for `API_BASE` and `WS_BASE`. Never hardcode `process.env.NEXT_PUBLIC_API_URL` again.
`core/src/components/ErrorBoundary.tsx`	Mounted at shell layout — one component crash cannot blank the whole app.
`docs/architecture/ue5-avatar-contract.md`	Wire format for the UE5 Blueprint dev. v2.1 with optional `agentState` field.
`core/src/components/atmosphere/`	Atmospheric design system primitives (`GlassCard`, `Button`, `MicroLabel`, `AtmosphereModal`, `PageHeader`, etc.). Shared across every shell page — start here when building new UI surfaces.
`core/src/app/globals.css`	Atmospheric design tokens — `--atm-hero` radial gradient, `--atm-glass-*` translucent surfaces, micro-label typography, global form styling.
`core/src/app/(shell)/page.tsx`	Main chat shell. Avatar locked large; conversation lives in collapsible right-side `ChatDrawer` defined in the same file.
`core/src/app/(shell)/voice-lab/page.tsx`	Voice Lab — ElevenLabs-style 3-tab UI (Generovať reč / Moje hlasy / Vytvoriť hlas) with workspace + right-rail settings on the Generate tab.
`core/src/components/kb/KBWorkspace.tsx` + `KBStudio.tsx`	NotebookLM-style 3-column Knowledge layout. `STUDY_TOOLS` is the single source of truth in `core/src/lib/kb/studyTools.ts`.
`core/src/stores/useKBStore.ts`	Zustand store for KB state. `removeDocument` action enables optimistic delete (avoids 404 surfaces after stale state).

Slovak STT models (pick one in Hardware Setup)

Model ID	WER on CV21	Speed	When to pick
`mlx-whisper-turbo`	~32%	0.5s on M2	Daily driver, Apple Silicon
`slopal-whisper-large-v3-turbo-sk` ⭐	~13%	0.8s on GPU, 3s CPU	Best balance — production Slovak
`slopal-whisper-large-v3-sk`	~12%	1.2s on GPU	Maximum accuracy, prefer GPU
`slopal-whisper-small-sk`	~25%	2s CPU	Lightweight CPU fallback

SloPal fine-tunes (NaiveNeuron, EMNLP 2025, CC-BY-4.0) deliver 65–70% WER reduction over base Whisper on Slovak. Drop-in via the existing faster-whisper backend.

Phase 6 platform spine (current architecture state)

EduTutor.AI is no longer a "Slovak chat tutor" — it's an agent platform with a UE5 avatar presence layer. Four foundations landed in Phase 6:

agentState v2.1 protocol — UE5 broadcast carries optional agentState: idle | thinking | searching | writing | listening. Backwards-compatible: omitted from payload when unset, so v2 Blueprints see byte-identical traffic.
Skill ABC + SkillRegistry — drop a Skill subclass into tutor-service/app/skills/<name>/skill.py with a tools() method returning OpenAI function-calling schemas. The chat tool-call loop dispatches automatically.
Tool-call loop in chat.py — prompt-based (works with every provider: Ollama, OpenAI, Anthropic, vLLM, custom). Bypassed when enabled_skills is empty (current Slovak tutor flow), so the existing experience is unaffected.
LearningMode extension — enabled_skills: list[str] and agent_type fields wire personas to skill subsets. Same avatar, same voice, different tool inventory per mode.
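To make the Skill / SkillRegistry idea above concrete, here is a minimal sketch of how a skill that exposes OpenAI function-calling schemas might be registered and dispatched. Only `Skill`, `SkillRegistry`, `tools()` and `enabled_skills` are names from this README; the `call`, `register`, `tools_for` and `dispatch` methods and the `FlashcardSkill` example are hypothetical, and the real code auto-registers skills rather than registering them by hand.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class Skill(ABC):
    name: str = ""

    @abstractmethod
    def tools(self) -> List[Dict[str, Any]]:
        """Return OpenAI function-calling schemas for this skill."""

    @abstractmethod
    def call(self, tool_name: str, arguments: Dict[str, Any]) -> str:
        """Run one tool and return its result as text (hypothetical method)."""


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def tools_for(self, enabled_skills: List[str]) -> List[Dict[str, Any]]:
        # An empty enabled_skills list yields no tools, so a mode without
        # skills never enters the tool-call loop.
        return [
            tool
            for name in enabled_skills
            if name in self._skills
            for tool in self._skills[name].tools()
        ]

    def dispatch(self, enabled_skills: List[str], tool_name: str,
                 arguments: Dict[str, Any]) -> str:
        for name in enabled_skills:
            skill = self._skills.get(name)
            if skill and any(t["function"]["name"] == tool_name for t in skill.tools()):
                return skill.call(tool_name, arguments)
        raise KeyError(tool_name)


class FlashcardSkill(Skill):
    """Hypothetical example skill with a single tool."""

    name = "flashcards"

    def tools(self) -> List[Dict[str, Any]]:
        return [{
            "type": "function",
            "function": {
                "name": "add_flashcard",
                "description": "Store a question/answer pair for later review.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "front": {"type": "string"},
                        "back": {"type": "string"},
                    },
                    "required": ["front", "back"],
                },
            },
        }]

    def call(self, tool_name: str, arguments: Dict[str, Any]) -> str:
        return f"saved: {arguments['front']} -> {arguments['back']}"
```

Keying both lookups on `enabled_skills` mirrors the statement above that the same avatar and voice get a different tool inventory per mode.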

595+ tests collected (62 files), zero deprecation warnings in core path.
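The backwards-compatibility claim for the agentState v2.1 protocol in this section comes down to one rule: leave the key out entirely when it is unset. The sketch below shows that rule with a plain JSON payload. The `viseme` field and the `build_broadcast` helper are placeholders, not the real wire format, which is specified in docs/architecture/ue5-avatar-contract.md.

```python
import json
from typing import Any, Dict, Optional

# The five states listed for the v2.1 protocol.
AGENT_STATES = {"idle", "thinking", "searching", "writing", "listening"}


def build_broadcast(base: Dict[str, Any], agent_state: Optional[str] = None) -> str:
    """Serialize a broadcast message, adding agentState only when it is set."""
    if agent_state is not None and agent_state not in AGENT_STATES:
        raise ValueError(f"unknown agentState: {agent_state!r}")
    payload = dict(base)  # copy so the caller's dict is never mutated
    if agent_state is not None:
        payload["agentState"] = agent_state
    return json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    print(build_broadcast({"viseme": "A"}))              # v2-shaped message
    print(build_broadcast({"viseme": "A"}, "thinking"))  # v2.1 message
```

Because the key is omitted rather than sent as `null`, a v2 consumer receives exactly the bytes it received before the field existed.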

Phase 8a + 8b identity + cross-session memory

EduTutor identifies users by a per-browser anonymous UUID — no login screen, no passwords, no email. Resolution priority on every API request: X-EduTutor-User-Id header (frontend localStorage key edututor_user_id) → edu_uid cookie (server-issued backup) → generate new UUID + set cookie.
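The resolution order above (header, then cookie, then a fresh UUID) can be written as a small pure function. This is a sketch under stated assumptions, not the backend's implementation: the header and cookie names come from the paragraph above, while `resolve_user_id`, the UUID validation step, and the `(user_id, set_cookie)` return shape are hypothetical.

```python
import uuid
from typing import Mapping, Optional, Tuple

HEADER_NAME = "X-EduTutor-User-Id"
COOKIE_NAME = "edu_uid"


def _valid_uuid(value: Optional[str]) -> Optional[str]:
    """Return the normalized UUID string, or None if the value is not a UUID."""
    if not value:
        return None
    try:
        return str(uuid.UUID(value))
    except ValueError:
        return None


def resolve_user_id(headers: Mapping[str, str],
                    cookies: Mapping[str, str]) -> Tuple[str, bool]:
    """Return (user_id, set_cookie) following header -> cookie -> new UUID."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {key.lower(): value for key, value in headers.items()}
    from_header = _valid_uuid(lowered.get(HEADER_NAME.lower()))
    if from_header:
        return from_header, False
    from_cookie = _valid_uuid(cookies.get(COOKIE_NAME))
    if from_cookie:
        return from_cookie, False
    # Nothing usable was sent: mint an ID and tell the caller to set the cookie.
    return str(uuid.uuid4()), True
```

Rejecting malformed values (an assumption in this sketch) means a corrupted localStorage entry falls through to the cookie instead of creating a user with a garbage ID.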

What this enables today:

Two browsers = two separate flashcard decks (no data bleed)
Existing Phase 7 single-deck users keep their cards (transparently reassigned to a synthetic legacy user, ID persisted to data/legacy_user_id.txt)
The Slovak tutor flow is byte-identical (no per-user state visible in the chat path because sk mode bypasses tools)

What's deferred to Phase 9:

Real authentication (magic link / OAuth)
Multi-device account claiming
The User model already has nullable email + is_anonymous flag, so Phase 9 only adds the claim flow on top

Phase 8b adds cross-session memory on top of the identity foundation. Memory only activates in assistant_pro and tutor_practice_pro modes — the Slovak tutor stays session-amnesiac by design. Profile data is structured (SQLite user_profile table); episodic recall is semantic (per-user ChromaDB collection edu_memory_<uid>). After each conversation ends, a background task summarizes the session via LLM and persists it into episodic memory, so the next session can recall it. Real auth (magic-link / OAuth) is out of scope for the grant prototype. The full identity contract is documented in docs/adrs/004-anonymous-by-default-identity.md.

Switching providers at runtime

# Switch STT
curl -X POST http://localhost:8000/api/v1/stt/switch \
  -H "Content-Type: application/json" -d '{"model_id": "faster-whisper-sk-small"}'

# Switch LLM
curl -X POST http://localhost:8000/api/v1/llm/switch \
  -H "Content-Type: application/json" -d '{"provider": "ollama:gemma3:12b"}'

# Auto-apply optimal config for detected hardware
curl -X POST http://localhost:8000/api/v1/system/apply \
  -H "Content-Type: application/json" -d '{}'

# What is currently running
curl http://localhost:8000/api/v1/system/status

# Save an API key without restarting
curl -X POST http://localhost:8000/api/v1/system/config \
  -H "Content-Type: application/json" -d '{"openai_api_key": "sk-..."}'
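The same calls can be scripted without curl. The sketch below builds the POST requests with Python's standard library; the endpoint paths and JSON bodies are copied from the curl examples above, while the helper names are hypothetical and `send` assumes the backend replies with a JSON body, which this README does not state.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "http://localhost:8000/api/v1"


def build_post(path: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request for one of the switch/apply endpoints."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the backend to be running on localhost:8000.
    print(send(build_post("/llm/switch", {"provider": "ollama:gemma3:12b"})))
    print(send(build_post("/system/apply", {})))
```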

Contributing

This project welcomes contributions. Read the entry points below before opening a PR.

Audience	Start here
First-time contributor	`CONTRIBUTING.md`
Architectural decisions	`docs/adrs/` — one ADR per invariant
Workflows (new skill, new provider, new mode)	`docs/workflows/`
Filing a bug or feature	`.github/ISSUE_TEMPLATE/`
Security disclosures	`SECURITY.md` (do NOT file public issues)
Code of Conduct	`CODE_OF_CONDUCT.md`
Changelog	`CHANGELOG.md`

One-click dev environments via .devcontainer/ — compatible with VS Code Dev Containers and GitHub Codespaces.

License

MIT · SORRYWECAN s.r.o.
