CourseTA is an Agentic AI-powered teaching assistant that helps educators process educational content, generate questions, create summaries, and build Q&A systems.
- File Upload: Upload PDF documents or audio/video files for automatic text extraction
- Question Generation: Create True/False or Multiple Choice questions from your content
- Content Summarization: Extract main points and generate comprehensive summaries
- Question Answering: Ask questions and get answers specific to your uploaded content
*(Demo video: CourseTA.Demo.mp4)*
- Python 3.9+
- Dependencies listed in `requirements.txt`
- FFmpeg (for audio/video processing)
- Ollama (optional, for local LLM support)
- Clone this repository:

  ```bash
  git clone https://github.com/Sh-31/CourseTA.git
  cd CourseTA
  ```
- Install FFmpeg:

  Linux (Ubuntu/Debian):

  ```bash
  sudo apt update
  sudo apt install ffmpeg
  ```
- Install the required Python packages:

  ```bash
  pip install -r requirements.txt
  ```
- (Optional) Install Ollama for local LLM support:

  Windows/macOS/Linux:

  - Download and install from https://ollama.ai/
  - Or use the installation script:

    ```bash
    curl -fsSL https://ollama.ai/install.sh | sh
    ```

  Pull the recommended model:

  ```bash
  ollama pull qwen3:4b
  ```
- Set up your environment variables (API keys, etc.): copy `.env.example` to `.env`, then update `.env` with your credentials:

  ```bash
  cp .env.example .env
  ```
- Start the FastAPI backend:

  ```bash
  python main.py
  ```
- In a separate terminal, start the Gradio UI:

  ```bash
  python gradio_ui.py
  ```
CourseTA uses a microservice architecture with agent-based workflows:
- FastAPI backend for API endpoints
- LangChain-based processing pipelines with multi-agent workflows
- LangGraph for LLM orchestration
CourseTA implements three main agent graphs, each designed with specific nodes, loops, and reflection mechanisms:
The Question Generation agent follows a human-in-the-loop pattern with reflection capabilities:
Nodes:
- QuestionGenerator: Initial question creation from content
- HumanFeedback: Human interaction node with interrupt mechanism
- Router: Decision node that routes based on feedback type
- QuestionRefiner: Automatic refinement using AI feedback
- QuestionRewriter: Manual refinement based on human feedback
Flow:
*(Video: Question.Generation.Graph.Flow.mp4)*
- Starts with question generation
- Enters human feedback loop with interrupt
- Router decides: `save` (END), `auto` (refiner), or `feedback` (rewriter)
- Both refiner and rewriter loop back to human feedback for continuous improvement (see the sketch below)
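This loop maps naturally onto LangGraph primitives. Below is a minimal sketch with placeholder node bodies and a simplified state schema, not CourseTA's actual implementation:

```python
# Sketch of the question-generation graph: conditional routing plus an
# interrupt before the human-feedback node. Node bodies are placeholders.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver

class QGState(TypedDict):
    question: str
    feedback: str

def question_generator(state: QGState) -> dict:
    return {"question": "..."}   # initial LLM generation

def human_feedback(state: QGState) -> dict:
    return {}                    # execution pauses before this node

def question_refiner(state: QGState) -> dict:
    return {"question": "..."}   # AI self-critique / reflection pass

def question_rewriter(state: QGState) -> dict:
    return {"question": "..."}   # rewrite guided by human feedback

def router(state: QGState) -> str:
    # "save" ends the graph; "auto" reflects; anything else is human feedback
    if state["feedback"] in ("save", "auto"):
        return state["feedback"]
    return "feedback"

builder = StateGraph(QGState)
builder.add_node("generator", question_generator)
builder.add_node("human_feedback", human_feedback)
builder.add_node("refiner", question_refiner)
builder.add_node("rewriter", question_rewriter)
builder.add_edge(START, "generator")
builder.add_edge("generator", "human_feedback")
builder.add_conditional_edges(
    "human_feedback", router,
    {"save": END, "auto": "refiner", "feedback": "rewriter"},
)
builder.add_edge("refiner", "human_feedback")   # reflection loop
builder.add_edge("rewriter", "human_feedback")  # human-in-the-loop loop
graph = builder.compile(
    checkpointer=MemorySaver(),           # required to resume after interrupts
    interrupt_before=["human_feedback"],  # pause for user input each cycle
)
```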
The Summarization agent uses a two-stage approach with iterative refinement:
Nodes:
- SummarizerMainPointNode: Extracts key points and creates table of contents
- SummarizerWriterNode: Generates detailed summary from main points
- UserFeedbackNode: Human review and feedback collection
- SummarizerRewriterNode: Refines summary based on feedback
- Router: Routes to save or continue refinement
Flow:
*(Video: summarztion_graph_flow.mp4)*
- Sequential processing: Main Points → Summary Writer → User Feedback
- Feedback loop: Router directs to rewriter or completion
- Rewriter loops back to user feedback for iterative improvement
The Q&A agent implements intelligent topic classification and retrieval:
Nodes:
- QuestionClassifier: Analyzes question relevance and retrieves context
- OnTopicRouter: Routes based on question relevance to content
- Retrieve: Fetches relevant document chunks using semantic search
- GenerateAnswer: Creates contextual answers from retrieved content
- OffTopicResponse: Handles questions outside the content scope
Flow:
*(Video: Question.Answer.flow.mp4)*
- Question classification with embedding-based relevance scoring (sketched below)
- Conditional routing: on-topic questions go through the retrieval pipeline
- Off-topic questions receive appropriate redirect responses
- No loops: single-pass processing for efficiency
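The on/off-topic decision can be as simple as a cosine-similarity threshold over embeddings. A minimal sketch, assuming a sentence-transformers model and a hypothetical cutoff; CourseTA's actual classifier and threshold may differ:

```python
# Embedding-based relevance check: route to retrieval only if the question
# is similar enough to at least one document chunk.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def is_on_topic(question: str, doc_chunks: list[str], threshold: float = 0.35) -> bool:
    q_emb = model.encode(question, convert_to_tensor=True)
    c_emb = model.encode(doc_chunks, convert_to_tensor=True)
    best = util.cos_sim(q_emb, c_emb).max().item()  # best cosine similarity
    return best >= threshold
```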
Human-in-the-Loop Design:
- Strategic interrupt points for human feedback
- Continuous refinement loops in generation and summarization
- User control over when to complete or continue refining
Reflection Agent Architecture:
- Feedback incorporation mechanisms
- History tracking for context preservation
- Iterative improvement through dedicated refiner/rewriter nodes
CourseTA exposes an async API that serves both standard JSON responses and Server-Sent Events (SSE) streams, providing real-time user experiences and efficient resource utilization.
Upload PDF documents or audio/video files for text extraction and processing.
URL: /upload_file/
Method: POST
Content-Type: multipart/form-data
Request Body:
- `file`: the file to upload (PDF, audio, or video format)
Response:

```json
{
  "message": "File processed successfully",
  "id": "uuid-string",
  "text_path": "path/to/extracted_text.txt",
  "original_file_path": "path/to/original_file"
}
```
Supported Formats:
- PDF: `.pdf` files
- Audio: `.mp3`, `.wav` formats
- Video: `.mp4`, `.avi`, `.mov`, `.mkv`, `.flv` formats
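For illustration, a hypothetical upload call with `requests`, assuming the backend runs locally on port 8000 (adjust the base URL to your deployment):

```python
import requests

# Upload a PDF and keep the returned asset ID for the other endpoints.
with open("lecture_slides.pdf", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/upload_file/",
        files={"file": ("lecture_slides.pdf", f, "application/pdf")},
    )
resp.raise_for_status()
asset_id = resp.json()["id"]
```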
Retrieve the processed text content for a given asset ID.
URL: /get_extracted_text/{asset_id}
Method: GET
Path Parameters:
- `asset_id`: the unique identifier returned from file upload
Response:

```json
{
  "asset_id": "uuid-string",
  "extracted_text": "Full text content..."
}
```
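Continuing the upload example above, fetching the extracted text is a plain GET:

```python
import requests

resp = requests.get(f"http://localhost:8000/get_extracted_text/{asset_id}")
resp.raise_for_status()
print(resp.json()["extracted_text"][:500])  # preview the first 500 characters
```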
Generate questions from uploaded content with human-in-the-loop feedback.
URL: /api/v1/graph/qg/start_session
Method: POST
Request Body Parameters:
- `asset_id`: Asset ID from file upload (required)
- `question_type`: Question type, "T/F" for True/False or "MCQ" for Multiple Choice (required)
Response:

```json
{
  "thread_id": "uuid-string",
  "status": "interrupted_for_feedback",
  "data_for_feedback": {
    "generated_question": "string",
    "options": ["string"], // or null
    "answer": "string",
    "explanation": "string",
    "message": "string"
  },
  "current_state": {}
}
```
Provide feedback to refine generated questions or save the current question.
URL: /api/v1/graph/qg/provide_feedback
Method: POST
Request Body:

```json
{
  "thread_id": "uuid-string",
  "feedback": "string"
}
```
Parameters:
- `thread_id`: Session ID from start_session (required)
- `feedback`: Feedback text, "auto" for automatic refinement, or "save" to finish (required)
Response:

```json
{
  "thread_id": "uuid-string",
  "status": "completed", // or "interrupted_for_feedback"
  "final_state": {} // or null
}
```
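Putting both endpoints together, a hypothetical end-to-end session (reusing `asset_id` from the upload example) might look like this:

```python
import requests

BASE = "http://localhost:8000"  # assumed local deployment

# Start a session; the graph interrupts and returns the first draft question.
resp = requests.post(
    f"{BASE}/api/v1/graph/qg/start_session",
    json={"asset_id": asset_id, "question_type": "MCQ"},
)
session = resp.json()
thread_id = session["thread_id"]
print(session["data_for_feedback"]["generated_question"])

# One automatic refinement pass, then save the question.
for feedback in ("auto", "save"):
    resp = requests.post(
        f"{BASE}/api/v1/graph/qg/provide_feedback",
        json={"thread_id": thread_id, "feedback": feedback},
    )
    print(resp.json()["status"])
```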
Generate content summaries with real-time streaming output.
URL: /api/v1/graph/summarizer/start_session_streaming
Method: POST
Response Content-Type: text/event-stream
Request Body:

```json
{
  "asset_id": "uuid-string"
}
```
Parameters:
- `asset_id`: Asset ID from file upload (required)
Streaming Response Events:

```
data: {"thread_id": "uuid", "status": "starting_session"}
data: {"event": "token", "token": "text", "status_update": "main_point_summarizer"}
data: {"event": "token", "token": "text", "status_update": "summarizer_writer"}
data: {"event": "stream_end", "thread_id": "uuid", "status_update": "Stream ended"}
```
Refine summaries based on user feedback with streaming responses.
URL: /api/v1/graph/summarizer/provide_feedback_streaming
Method: POST
Response Content-Type: text/event-stream
Request Body:

```json
{
  "thread_id": "uuid-string",
  "feedback": "string"
}
```
Parameters:
- `thread_id`: Session ID from start_session_streaming (required)
- `feedback`: Feedback text or "save" to finish (required)
Streaming Response Events:

```
data: {"thread_id": "uuid", "status": "resuming_with_feedback"}
data: {"event": "token", "token": "text", "status_update": "summarizer_rewriter"}
data: {"event": "stream_end", "thread_id": "uuid", "status_update": "Stream ended"}
```
Answer questions based on uploaded content with streaming responses.
URL: /api/v1/graph/qa/start_session_stream
Method: POST
Response Content-Type: text/event-stream
Request Body:

```json
{
  "asset_id": "uuid-string",
  "initial_question": "string"
}
```
Parameters:
- `asset_id`: Asset ID from file upload (required)
- `initial_question`: The first question to ask about the content (required)
Streaming Response Events:

```
data: {"type": "metadata", "thread_id": "uuid", "asset_id": "uuid"}
data: {"type": "token", "content": "answer text..."}
data: {"type": "complete"}
```
Continue an existing Q&A session with follow-up questions.
URL: /api/v1/graph/qa/continue_conversation_stream
Method: POST
Response Content-Type: text/event-stream
Request Body:

```json
{
  "thread_id": "uuid-string",
  "next_question": "string"
}
```
Streaming Response Events:

```
data: {"type": "metadata", "thread_id": "uuid"}
data: {"type": "token", "content": "answer text..."}
data: {"type": "complete"}
```
Required Headers:

```
Accept: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
```
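A hypothetical Q&A session with a follow-up question, using the headers above, the `asset_id` from the upload example, and the same SSE parsing pattern as the summarizer sketch:

```python
import json
from typing import Optional

import requests

BASE = "http://localhost:8000"  # assumed local deployment
HEADERS = {
    "Accept": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
}

def stream_answer(url: str, payload: dict) -> Optional[str]:
    """POST to a QA streaming endpoint, print tokens, return the thread_id."""
    thread_id = None
    with requests.post(url, json=payload, headers=HEADERS, stream=True) as resp:
        for line in resp.iter_lines(decode_unicode=True):
            if not line or not line.startswith("data: "):
                continue
            event = json.loads(line[len("data: "):])
            if event.get("type") == "metadata":
                thread_id = event.get("thread_id")
            elif event.get("type") == "token":
                print(event["content"], end="", flush=True)
            elif event.get("type") == "complete":
                print()
    return thread_id

thread_id = stream_answer(
    f"{BASE}/api/v1/graph/qa/start_session_stream",
    {"asset_id": asset_id, "initial_question": "What are the key topics?"},
)
stream_answer(
    f"{BASE}/api/v1/graph/qa/continue_conversation_stream",
    {"thread_id": thread_id, "next_question": "Expand on the first topic."},
)
```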