Voice-first mock interview platform that pairs a FastAPI/LangGraph backend with a Vite + React frontend. Candidates speak answers in the browser, the interviewer agent responds with synthesized audio, and background agents score performance and surface harder follow-ups.
## Table of Contents

- Architecture
- Agent Orchestration (LangGraph)
- Key Features
- Directory Layout
- Prerequisites
- Environment Variables
- Getting Started
- Interview Flow
- Development Notes
- Troubleshooting
## Architecture

- **Backend** (`backend/`): FastAPI app with LangGraph agents, async SQLAlchemy, and ElevenLabs TTS integration. Serves `/api` endpoints and static `media/` files.
- **Agents** (`backend/app/core/agents/`):
  - `onboarding_agent`: Reads resume + JD, seeds a problem set, and persists session metadata.
  - `knowledge_agent`: Researches gaps in the background while onboarding finishes.
  - `interviewer_agent`: Manages the Q&A loop, storing each question/answer pair.
  - `scoring_agent`: Grades answered interactions asynchronously.
  - `hard_question_agent`: Generates additional "hard mode" follow-ups when criteria are met.
- **Database** (`backend/app/db/`): PostgreSQL with Alembic migrations. `InterviewSession` and `InterviewInteraction` track state, audio URLs, transcripts, scores, etc.
- **Frontend** (`frontend/`): React + Vite UI that records audio with `MediaRecorder`, auto-transcribes with the Web Speech API (Chrome), and calls endpoints via `frontend/src/api/interviews.js`.
- **Media Pipeline**: Assistant speech is synthesized via ElevenLabs, stored under `media/interviews/*`, and served through `/media`.
## Agent Orchestration (LangGraph)

The heart of the system is a web of LangGraph workflows that coordinate onboarding, knowledge enrichment, interviewing, scoring, and hard-question generation. Each agent compiles a `StateGraph` with `InterviewState` or related `TypedDict`s to keep track of the session lifecycle.
```mermaid
graph TD
    A[POST /api/interviews] -->|init_state| B[Onboarding Graph]
    B -->|problem_set + max_index| C[Knowledge Graph]
    C -->|research notes| D[Interview Graph]
    D -->|question asked| E[User Voice Answer]
    E -->|audio+text| F[submit_answer]
    F -->|BackgroundTasks| G[Scoring Graph]
    F -->|BackgroundTasks| H[Hard Question Graph]
    H -->|new difficult Q| D
    G -->|grade_data| Review[Review UI/Session Summary]
```
### Onboarding Agent

- Workflow (`agent.py`) seeds `InterviewState` from the raw JD/resume.
- Nodes populate candidate data, generate an initial `problem_set`, and set `max_index`.
- Uses `resume_pdf_input` bytes if provided.
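As a rough illustration of that seeding step (all names and fields here are hypothetical; the real nodes live in `backend/app/core/agents/onboarding_agent/` and derive questions with an LLM):

```python
def seed_interview_state(jd_text: str, resume_text: str, num_questions: int = 5) -> dict:
    """Build the kind of initial state an onboarding graph might persist.

    Sketch only: we fake a problem set from the JD's first lines instead
    of calling an LLM.
    """
    topics = [line.strip() for line in jd_text.splitlines() if line.strip()][:num_questions]
    problem_set = [f"Tell me about your experience with: {t}" for t in topics]
    return {
        "jd_text": jd_text,
        "resume_text": resume_text,
        "problem_set": problem_set,
        "current_index": 0,
        "max_index": len(problem_set) - 1,
    }


state = seed_interview_state("Python\nSQL\nDocker", "Jane Doe, backend engineer")
```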
### Knowledge Agent

- Runs asynchronously after onboarding via `background.add_task`.
- Provides additional context (e.g., recent company news) that downstream nodes can merge into `reference_data`.
### Interviewer Agent

- Defined in `agent.py`.
- Nodes:
  - `ensure_session_node`: creates a DB session if missing.
  - `speak_node`: fetches `next_unanswered` from the repo and emits an `AIMessage`.
  - `save_response_node`: writes the user's transcript to the exact `current_index`.
- Uses LangGraph's `MemorySaver` so `current_index` persists between requests.
### Scoring Agent

- Invoked once per answer (`_run_background_once`).
- Pulls `next_ungraded_answered`, grades it, and stores `grade_data`.
- Outputs accuracy/communication/completeness scores plus textual feedback.
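The shape of that `grade_data` payload might look like the sketch below (a hypothetical stand-in; the real agent grades with an LLM, and the 0-10 scale here is an assumption):

```python
def grade_answer(question: str, answer: str) -> dict:
    """Return a grade_data-style dict with the three score axes plus feedback.

    Illustrative heuristic only: real scoring is done by an LLM.
    """
    has_content = len(answer.split()) >= 5
    return {
        "accuracy": 7 if has_content else 2,
        "communication": 6 if has_content else 2,
        "completeness": 5 if has_content else 1,
        "feedback": "Good detail." if has_content else "Answer was too short.",
    }


grade = grade_answer(
    "What is an index?",
    "An index speeds up lookups at the cost of writes.",
)
```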
### Hard Question Agent

- Also triggered in the background.
- Checks whether the base question meets the criteria for a "hard mode" follow-up (e.g., a good score or certain tags).
- Inserts new `InterviewInteraction` rows referencing the parent via `reference_data.meta.parent_order_index`.
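A follow-up row built this way could look like the sketch below. The row shape and the order-index offset are assumptions for illustration; only the parent link via `reference_data.meta.parent_order_index` is taken from the description above.

```python
def build_hard_followup(parent: dict, hard_question: str) -> dict:
    """Sketch a child InterviewInteraction row linked back to its parent."""
    return {
        # Hypothetical offset so follow-ups sort after the base questions
        "order_index": parent["order_index"] + 100,
        "question_text": hard_question,
        "reference_data": {"meta": {"parent_order_index": parent["order_index"]}},
    }


row = build_hard_followup({"order_index": 3}, "Now make it lock-free.")
```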
### Shared State and Repository

- `InterviewState`: extends `MessagesState` with fields like `problem_set`, `current_index`, and `ready_question_index`. The `reduce_problems` reducer keeps the list manageable.
- `InterviewRepo`: encapsulates all DB operations (append, upsert, merge reference data, fetch pending questions, etc.), ensuring consistency even when multiple agents run concurrently.
### Session Lifecycle

- **Create Session** → onboarding graph seeds data.
- **Knowledge Agent** → runs in the background to enrich context.
- **Start Interview** → interviewer graph retrieves the next unanswered question.
- **Answer Submission** → transcript/audio saved, the interviewer graph continues to the next question, and the background Scoring + Hard Question agents each run once.
- **Completion** → detected via `_detect_completed` or when all questions up to `max_index` are answered.
This modular LangGraph setup lets you extend behavior by adding nodes or entire graphs (e.g., behavioral interviews, coding challenges) while keeping each concern isolated yet orchestrated through shared session state and repositories.
## Key Features

- Voice-first interview loop with preview + re-record controls before submission.
- Automatic speech-to-text on the client; the backend requires both audio and transcript.
- Persistent LangGraph state using `MemorySaver` to keep track of `current_index`.
- Background scoring and hard-question generation triggered once per answer.
- TTS caching: generated question audio URLs are saved back to the interaction record for replays/polling.
- History endpoints to list prior sessions per authenticated user.
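The TTS caching behavior can be sketched roughly as follows (function names and the row shape are hypothetical; `synthesize` stands in for the ElevenLabs call):

```python
def get_question_audio(interaction: dict, synthesize) -> str:
    """Synthesize a question's audio once, persist the URL on the
    interaction record, and reuse it on replays/polling."""
    if not interaction.get("audio_url"):
        interaction["audio_url"] = synthesize(interaction["question_text"])
    return interaction["audio_url"]


# Fake synthesizer to show the caching effect
calls = []


def fake_tts(text: str) -> str:
    calls.append(text)
    return f"/media/interviews/{len(calls)}.mp3"


row = {"question_text": "Explain ACID."}
first = get_question_audio(row, fake_tts)
second = get_question_audio(row, fake_tts)  # cached; no second synthesis
```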
## Directory Layout

```text
backend/
  app/
    auth/                  # JWT + login routes
    core/                  # LangGraph states and agent nodes
      agents/
        onboarding_agent/
        knowledge_agent/
        interviewer_agent/
        scoring_agent/
        hard_question_agent/
    db/                    # SQLAlchemy models, repos, migrations
    routes/
      interviews_api.py
    main.py                # FastAPI entrypoint
frontend/
  src/
    api/                   # Axios wrappers
    pages/
      InterviewPage.jsx
  package.json
docker-compose.yml         # Postgres + backend services
requirements.txt           # Python dependencies
```
## Prerequisites

- Python 3.11+
- Node.js 20+ (for Vite/React)
- PostgreSQL 15 (local or via Docker)
- Optional: Docker + Docker Compose v2
- API keys: ElevenLabs for TTS, Google/Serper for the research agents, and a JWT secret for auth.
## Environment Variables

Create a `.env` at the repo root (never commit real secrets).

| Variable | Purpose |
|---|---|
| `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`, `POSTGRES_HOST`, `POSTGRES_PORT` | Database connection info. Use `db` as host inside Docker. |
| `ELEVENLABS_API_KEY`, `ELEVENLABS_VOICE_ID`, `ELEVENLABS_MODEL_ID` | Audio synthesis for interviewer prompts. |
| `MEDIA_ROOT`, `MEDIA_BASE_URL` | Where generated audio files live and how they are served. Defaults: `media` and `/media`. |
| `SERPER_API_KEY`, `GOOGLE_API_KEY`, `GOOGLE_CSE_ID` | Knowledge research integrations. |
| `JWT_SECRET_KEY` | Signing key for auth tokens. |
| Any other provider keys referenced by LangChain integrations | Additional integrations/tools. |
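A settings module might read these values like the sketch below (`require_env` is a hypothetical helper, not part of the codebase; the defaults mirror the table above):

```python
import os

# Variables with documented defaults
MEDIA_ROOT = os.environ.get("MEDIA_ROOT", "media")
MEDIA_BASE_URL = os.environ.get("MEDIA_BASE_URL", "/media")


def require_env(name: str) -> str:
    """Fail fast on secrets that have no safe default (e.g. JWT_SECRET_KEY)."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```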
## Getting Started

### Backend

```bash
cd backend
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
# source .venv/bin/activate
pip install -r ../requirements.txt
alembic upgrade head
uvicorn app.main:app --reload
```

### Frontend

```bash
cd frontend
npm install
npm run dev   # http://localhost:5174
```

### Docker (optional)

```bash
docker compose up --build
# backend exposed on :8000, Postgres on :5433 (if configured that way)
```
## Interview Flow

- **Create Session** – `POST /api/interviews` with JD + resume (text or PDF). The onboarding agent seeds the problem set and stores `max_index`.
- **Start Interview** – `POST /api/interviews/{session_id}/start` fetches the next unanswered question, generating ElevenLabs audio if missing, and returns:
  - question text
  - audio URL (if available)

  If `status="waiting"`, the frontend polls every ~1.2 s until a question is ready.
- **Record Answer** – The browser records via `MediaRecorder` and transcribes via the Web Speech API. Users can preview or re-record before submission.
- **Submit Answer** – `POST /api/interviews/{session_id}/answer` (multipart form) with payload `answer_audio` (webm) and `answer_text` (transcript). The backend:
  - saves audio/transcript to the latest asked question (guarding against index drift);
  - triggers `SCORING_APP` and `HARD_APP` background cycles;
  - runs the interviewer graph again with `current_index` to fetch the next question or detect completion.
- **Completion** – When `_detect_completed` returns true or all questions up to `max_index` are answered, the session is marked completed and the frontend navigates to review.
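The client-side polling used while a question is being prepared can be sketched as below (in Python for illustration; the real frontend does this in JS, and `fetch_status` is a stand-in for hitting the start endpoint):

```python
import time


def poll_for_question(fetch_status, interval: float = 1.2, max_attempts: int = 50) -> dict:
    """Call fetch_status until it stops reporting status="waiting"."""
    for _ in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") != "waiting":
            return payload
        time.sleep(interval)
    raise TimeoutError("Question was not ready in time")


# Simulate two waiting responses followed by a ready one
responses = iter(
    [{"status": "waiting"}, {"status": "waiting"}, {"status": "ready", "question": "Q1"}]
)
result = poll_for_question(lambda: next(responses), interval=0.0)
```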
## Development Notes

- `MEDIA_ROOT/interviews` is created automatically; ensure `app.mount("/media", StaticFiles(...))` is active for playback.
- `SpeechRecognition` works only in Chromium-based browsers; the frontend should surface a warning otherwise.
- Agents rely on LangChain + LangGraph; see `backend/app/core/agents/**/*` when extending workflows.
- Repo functions like `get_current_asked_unanswered`, `next_unanswered`, and `save_user_answer` ensure question-order integrity. Always work through the repo layer instead of issuing direct session queries.
- Background tasks use FastAPI's `BackgroundTasks`; avoid long-running synchronous code inside main endpoints.
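The index-drift guard those repo functions provide can be sketched as below (the function name and row shape are hypothetical; the idea is that an answer only ever attaches to the earliest asked-but-unanswered question):

```python
def save_user_answer_guarded(interactions: list[dict], transcript: str) -> int:
    """Write the transcript to the earliest asked-but-unanswered question.

    Never writes to an arbitrary index, so a stale client can't
    overwrite an earlier answer.
    """
    for row in sorted(interactions, key=lambda r: r["order_index"]):
        if row.get("asked") and not row.get("answer_text"):
            row["answer_text"] = transcript
            return row["order_index"]
    raise ValueError("No asked, unanswered question to attach this answer to")


rows = [
    {"order_index": 0, "asked": True, "answer_text": "done"},
    {"order_index": 1, "asked": True, "answer_text": None},
    {"order_index": 2, "asked": False, "answer_text": None},
]
idx = save_user_answer_guarded(rows, "my answer")
```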
## Troubleshooting

- **Repeated Question 0 / Wrong Index** – Ensure `/answer` uses the DB's current index. Repo guard rails (e.g., `get_current_asked_unanswered`) prevent overwriting earlier answers.
- **No Audio Playback** – Verify ElevenLabs credentials and that `/media` is mounted. Some browsers block autoplay; the frontend should retry or require a user gesture.
- **Stuck on "Generating next question…"** – Check the knowledge/interviewer agent logs and confirm `HARD_APP`/`SCORING_APP` aren't blocking. You can inspect LangGraph checkpoints in `_CHECKPOINTER`.
- **Database errors** – Ensure Alembic migrations are applied (`alembic upgrade head`) and that Docker/Postgres credentials match `.env`.
Happy interviewing! Contributions welcome via pull requests and issues—please include tests or repro steps where possible.