A Django-based backend API for interacting with PDF documents using AI. This application leverages GROQ's LLM to provide intelligent document analysis, question answering, summarization, and question generation capabilities.
- Features
- Architecture
- Tech Stack
- Project Structure
- Getting Started
- Environment Configuration
- API Documentation
- Frontend Repository
- License
- Question Answering: Ask questions about your PDF documents and get AI-powered answers
- Document Summarization: Generate concise summaries of PDF content (configurable page ranges)
- Question Generation: Automatically generate relevant questions from document content
- PDF Processing: Upload and process PDF files with text extraction
- RESTful API: Clean, versioned API endpoints (v1)
- Enterprise Architecture: Domain-driven design with clear separation of concerns
- Comprehensive Error Handling: Custom exceptions with detailed error responses
- Configurable AI Models: Support for different GROQ model configurations
- Logging & Monitoring: Structured logging with JSON/text format support
This project follows Clean Architecture principles with the following layers:
chat_bot_api/
├── api/ # API Layer (Controllers)
│ ├── middlewares/ # Custom middleware
│ └── v1/ # API version 1
│ ├── views.py # Request handlers
│ ├── serializers.py # Input validation
│ └── validators.py # Custom validators
│
├── application/ # Application Layer (Use Cases)
│ ├── dto/ # Data Transfer Objects
│ │ ├── request_dto.py # Request DTOs
│ │ └── response_dto.py # Response DTOs
│ ├── services/ # Business logic services
│ │ ├── agent_service.py # AI agent orchestration
│ │ ├── question_answer_service.py # Q&A service
│ │ ├── summary_service.py # Summarization service
│ │ └── question_generation_service.py # Question generation
│ └── use_cases/ # Application use cases
│
├── domain/ # Domain Layer (Business Logic)
│ ├── enums/ # Enumerations
│ │ ├── action_types.py # Action type enums
│ │ └── user_types.py # User type enums
│ ├── exceptions/ # Custom exceptions
│ │ ├── base.py # Base exception classes
│ │ ├── agent_exceptions.py
│ │ └── pdf_exceptions.py
│ └── models/ # Domain models
│
├── infrastructure/ # Infrastructure Layer
│ ├── cache/ # Caching implementations
│ ├── external/ # External service integrations
│ ├── repositories/ # Data access layer
│ └── storage/ # File storage (local/S3)
│
├── core/ # Core Utilities
│ ├── decorators/ # Custom decorators (e.g., retry)
│ ├── mixins/ # Reusable mixins
│ └── utils/ # Helper utilities
│ ├── logger.py # Logging utilities
│ ├── helpers.py # General helpers
│ └── validators.py # Validation utilities
│
└── tests/ # Test Suite
├── unit/ # Unit tests
└── integration/ # Integration tests
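To make the layering concrete, the sketch below shows one way a request could travel from the API layer to an application service, keyed on the action type. It is a minimal illustration, not the project's actual code: `ActionType`, `ConversationRequest`, `HANDLERS`, and `dispatch` are hypothetical names, and the handlers return placeholder strings instead of calling the LLM.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class ActionType(str, Enum):
    """Hypothetical stand-in for domain/enums/action_types.py."""
    QUESTION_ANSWER = "question_answer"
    SUMMARIZER = "summarizer"
    GENERATE_QUESTIONS = "generate_questions"


@dataclass(frozen=True)
class ConversationRequest:
    """Hypothetical request DTO, in the spirit of application/dto/request_dto.py."""
    action: ActionType
    document_url: str
    question: Optional[str] = None


# Placeholder services: real implementations would call the AI agent.
def answer_question(req: ConversationRequest) -> str:
    return f"answer for {req.question!r}"


def summarize(req: ConversationRequest) -> str:
    return f"summary of {req.document_url}"


def generate_questions(req: ConversationRequest) -> str:
    return f"questions about {req.document_url}"


HANDLERS: Dict[ActionType, Callable[[ConversationRequest], str]] = {
    ActionType.QUESTION_ANSWER: answer_question,
    ActionType.SUMMARIZER: summarize,
    ActionType.GENERATE_QUESTIONS: generate_questions,
}


def dispatch(payload: dict) -> str:
    """Validate the raw payload (API layer) and route it to a service."""
    try:
        action = ActionType(payload["action"])
        document_url = payload["documentUrl"]
    except (KeyError, ValueError) as exc:
        raise ValueError("missing field or unsupported action") from exc
    request = ConversationRequest(action, document_url, payload.get("question"))
    return HANDLERS[action](request)
```

Keeping the enum in the domain layer and the handler table in the application layer means the view only has to translate HTTP into a DTO; adding a new action would touch the enum and one service, not the request handler.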
Framework & Core
- Django 4.2+
- Django REST Framework
- Django CORS Headers
AI & ML
- Phidata - AI agent framework
- GROQ API - LLM inference
PDF Processing
- PyMuPDF (fitz) - PDF text extraction
Database
- SQLite (development)
- PostgreSQL support (production-ready)
Production Tools
- Gunicorn - WSGI server
- WhiteNoise - Static file serving
- python-dotenv - Environment management
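As a rough illustration of the PDF processing step, the sketch below extracts text from a 1-based, inclusive page range with PyMuPDF. The helper names `page_indices` and `extract_text` are assumptions made for this example, as is the choice to clamp the range to the document length; the project's own extraction code may behave differently.

```python
from typing import List


def page_indices(min_page: int, max_page: int, page_count: int) -> List[int]:
    """Convert a 1-based inclusive page range to 0-based indices,
    clamped to the number of pages in the document."""
    if min_page < 1 or max_page < min_page:
        raise ValueError("expected 1 <= min_page <= max_page")
    last = min(max_page, page_count)
    return list(range(min_page - 1, last))


def extract_text(path: str, min_page: int = 1, max_page: int = 5) -> str:
    """Return the text of the selected pages, separated by blank lines."""
    import fitz  # PyMuPDF; imported lazily so page_indices works without it

    with fitz.open(path) as doc:
        indices = page_indices(min_page, max_page, doc.page_count)
        return "\n\n".join(doc.load_page(i).get_text() for i in indices)
```

For example, `extract_text("report.pdf", 2, 4)` would read pages 2 through 4 of a local file named `report.pdf` (a made-up filename).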
file-talk-ai/
├── chat_bot_api/ # Main application
├── config/ # Configuration module
│ ├── env_config.py # Environment configuration
│ ├── constants.py # Application constants
│ └── settings/ # Django settings
├── file_talk_ai_project/ # Django project settings
├── media/ # Uploaded files (gitignored)
├── manage.py # Django management script
├── requirements.txt # Python dependencies
└── .env.example # Environment variables template
- Python 3.8+
- pip (Python package manager)
- Virtual environment (recommended)
- GROQ API key (available from the GROQ Console)
1. Clone the repository
   git clone https://github.com/AkshayPanchivala/File-Talk-AI-Backend.git
   cd File-Talk-AI-Backend-main/file-talk-ai
2. Create and activate virtual environment
   # Windows
   python -m venv venv
   venv\Scripts\activate
   # macOS/Linux
   python3 -m venv venv
   source venv/bin/activate
3. Install dependencies
   pip install -r requirements.txt
4. Configure environment variables
   # Create .env file (see Environment Configuration section)
   cp .env.example .env
   # Edit .env and add your GROQ API key
5. Run database migrations
   python manage.py migrate
6. Start the development server
   python manage.py runserver
The API will be available at http://localhost:8000
Create a .env file in the project root (file-talk-ai/) directory:
# ===================================
# Required Configuration
# ===================================
groqApiKey=your_groq_api_key_here
# ===================================
# AI Model Configuration (Optional)
# ===================================
GROQ_MODEL_ID=llama-3.3-70b-versatile
AI_TEMPERATURE=0.7
AI_MAX_TOKENS=8000
# ===================================
# PDF Processing (Optional)
# ===================================
PDF_DOWNLOAD_TIMEOUT=30
PDF_MAX_PAGES=100
PDF_DEFAULT_MIN_PAGE=1
PDF_DEFAULT_MAX_PAGE=5
PDF_STORAGE_PATH=media/pdfs
# ===================================
# Summary Configuration (Optional)
# ===================================
SUMMARY_MIN_WORDS=8000
# ===================================
# Question Generation (Optional)
# ===================================
QUESTIONS_COUNT=20
# ===================================
# Cache Configuration (Optional)
# ===================================
CACHE_ENABLED=False
CACHE_TTL=3600
REDIS_URL=redis://localhost:6379/0
# ===================================
# Storage Configuration (Optional)
# ===================================
STORAGE_BACKEND=local # Options: local, s3
# AWS_ACCESS_KEY_ID=your_aws_access_key
# AWS_SECRET_ACCESS_KEY=your_aws_secret_key
# AWS_STORAGE_BUCKET_NAME=your_bucket_name
# AWS_S3_REGION_NAME=us-east-1
# ===================================
# Logging Configuration (Optional)
# ===================================
LOG_LEVEL=INFO # Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_FORMAT=json # Options: json, text
# ===================================
# Feature Flags (Optional)
# ===================================
ENABLE_RATE_LIMITING=False
ENABLE_MONITORING=False
# ===================================
# Environment (Optional)
# ===================================
ENVIRONMENT=development # Options: development, testing, production
DEBUG=True
To obtain a GROQ API key:
- Visit the GROQ Console
- Sign up or log in to your account
- Navigate to the API Keys section
- Click Create API Key
- Copy the generated key
- Add it to your .env file
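The variables above could be read into a typed settings object along these lines. This is a sketch only: `AppConfig`, `load_config`, and `_as_bool` are hypothetical names, not the contents of `config/env_config.py`. Most defaults mirror the sample values shown above; `DEBUG` defaults to off here as a conservative assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'True', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class AppConfig:
    groq_api_key: str
    groq_model_id: str
    ai_temperature: float
    ai_max_tokens: int
    pdf_max_pages: int
    cache_enabled: bool
    debug: bool


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build an AppConfig from environment variables, failing fast
    when the one required key is missing."""
    api_key = env.get("groqApiKey", "")
    if not api_key:
        raise RuntimeError("groqApiKey is required")
    return AppConfig(
        groq_api_key=api_key,
        groq_model_id=env.get("GROQ_MODEL_ID", "llama-3.3-70b-versatile"),
        ai_temperature=float(env.get("AI_TEMPERATURE", "0.7")),
        ai_max_tokens=int(env.get("AI_MAX_TOKENS", "8000")),
        pdf_max_pages=int(env.get("PDF_MAX_PAGES", "100")),
        cache_enabled=_as_bool(env.get("CACHE_ENABLED", "False")),
        debug=_as_bool(env.get("DEBUG", "False")),
    )
```

Passing the environment as a parameter keeps the loader easy to test: a plain dictionary can stand in for `os.environ`.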
Base URL: http://localhost:8000/api/v1/chat-bot/
GET /api/v1/chat-bot/conversation/
Response:
{
"message": "File Talk AI - Conversation API",
"version": "v1",
"endpoints": {
"POST /conversation/": "Process conversation",
"POST /options/": "Get available options"
}
}
POST /api/v1/chat-bot/conversation/
Content-Type: application/json
Request Body:
{
"action": "question_answer",
"documentUrl": "https://example.com/document.pdf",
"question": "What is the main topic of this document?"
}
Response:
{
"data": "The main topic of this document is...",
"message": "Question answered successfully"
}
POST /api/v1/chat-bot/conversation/
Content-Type: application/json
Request Body:
{
"action": "summarizer",
"documentUrl": "https://example.com/document.pdf",
"minPage": 1,
"maxPage": 5
}
Response:
{
"data": "Summary of the document...",
"message": "Document summarized successfully"
}
POST /api/v1/chat-bot/conversation/
Content-Type: application/json
Request Body:
{
"action": "generate_questions",
"documentUrl": "https://example.com/document.pdf",
"minPage": 1,
"maxPage": 5
}
Response:
{
"data": [
"Question 1?",
"Question 2?",
...
],
"message": "Questions generated successfully"
}
POST /api/v1/chat-bot/options/
Content-Type: application/json
Request Body:
{
"startedChatbot": true
}
Response:
{
"options": [
"question_answer",
"summarizer",
"generate_questions"
]
}
Available actions:
- question_answer - Answer questions from PDF content
- summarizer - Summarize PDF document
- generate_questions - Generate questions from PDF content
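The conversation endpoint can be exercised from Python with only the standard library. The sketch below assumes the development server is running at http://localhost:8000; `build_payload` and `post_conversation` are hypothetical helper names, and the field requirements are inferred from the sample requests rather than from the project's serializers.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000/api/v1/chat-bot"


def build_payload(
    action: str,
    document_url: str,
    question: Optional[str] = None,
    min_page: Optional[int] = None,
    max_page: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a request body shaped like the sample requests."""
    payload: Dict[str, Any] = {"action": action, "documentUrl": document_url}
    if action == "question_answer":
        if not question:
            raise ValueError("question_answer requires a question")
        payload["question"] = question
    else:
        # Page bounds are optional; omit them to use the server defaults.
        if min_page is not None:
            payload["minPage"] = min_page
        if max_page is not None:
            payload["maxPage"] = max_page
    return payload


def post_conversation(payload: Dict[str, Any], timeout: float = 60.0) -> Any:
    """POST the payload to /conversation/ and return the decoded JSON."""
    request = urllib.request.Request(
        f"{BASE_URL}/conversation/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A summarization call would then look like `post_conversation(build_payload("summarizer", "https://example.com/document.pdf", min_page=1, max_page=5))`.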
All errors follow this structure:
{
"error": {
"code": "ERROR_CODE",
"message": "Human-readable error message",
"details": {
"field": "Additional error context"
}
}
}
Common Error Codes:
- VALIDATION_ERROR - Invalid request data
- PDF_PROCESSING_ERROR - PDF processing failed
- AGENT_ERROR - AI agent processing error
- INTERNAL_ERROR - Server error
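One way to produce this envelope is to give each custom exception a code and a serializer. The class names below (`AppError`, `ValidationError`, `PDFProcessingError`, `AgentError`) are hypothetical and are not taken from `domain/exceptions/`; only the JSON shape and the error codes come from the description above.

```python
from typing import Any, Dict, Optional


class AppError(Exception):
    """Base class: subclasses override `code`; unknown failures stay INTERNAL_ERROR."""

    code = "INTERNAL_ERROR"

    def __init__(self, message: str, details: Optional[Dict[str, Any]] = None):
        super().__init__(message)
        self.message = message
        self.details = details or {}

    def to_dict(self) -> Dict[str, Any]:
        """Serialize to the documented error envelope."""
        return {
            "error": {
                "code": self.code,
                "message": self.message,
                "details": self.details,
            }
        }


class ValidationError(AppError):
    code = "VALIDATION_ERROR"


class PDFProcessingError(AppError):
    code = "PDF_PROCESSING_ERROR"


class AgentError(AppError):
    code = "AGENT_ERROR"
```

A view or middleware could then catch `AppError` in one place and return `exc.to_dict()` with an appropriate HTTP status.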
This backend is designed to work with the File Talk AI frontend:
Frontend Repository: File Talk AI - Frontend (React + Vite)
Make sure to configure the frontend's VITE_API_BASE_URL environment variable to point to this backend:
VITE_API_BASE_URL=http://localhost:8000
Run the test suite:
python manage.py test
Create an admin user:
python manage.py createsuperuser
Navigate to http://localhost:8000/admin/ after creating a superuser.
For production deployment:
- Set ENVIRONMENT=production in .env
- Set DEBUG=False
- Configure PostgreSQL database
- Use Gunicorn as WSGI server
- Configure static file serving with WhiteNoise
- Set up proper CORS settings
- Enable rate limiting and monitoring
- Consider using Redis for caching
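Before deploying, a small preflight check can catch the most common misconfigurations from the list above. The function below is a hypothetical sketch (`production_warnings` is not part of the project) that inspects only the variables documented in the environment configuration.

```python
from typing import List, Mapping


def _is_true(value: str) -> bool:
    """Interpret common truthy spellings such as 'True', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def production_warnings(env: Mapping[str, str]) -> List[str]:
    """Return a human-readable warning for each checklist item
    that the given environment does not satisfy."""
    warnings: List[str] = []
    if env.get("ENVIRONMENT", "development") != "production":
        warnings.append("ENVIRONMENT is not set to production")
    if _is_true(env.get("DEBUG", "False")):
        warnings.append("DEBUG must be False in production")
    if not _is_true(env.get("ENABLE_RATE_LIMITING", "False")):
        warnings.append("rate limiting is disabled")
    if not _is_true(env.get("ENABLE_MONITORING", "False")):
        warnings.append("monitoring is disabled")
    if _is_true(env.get("CACHE_ENABLED", "False")) and not env.get("REDIS_URL"):
        warnings.append("CACHE_ENABLED is set but REDIS_URL is missing")
    return warnings
```

Calling `production_warnings(os.environ)` at startup and logging the result is one possible use; the check does not cover database, CORS, or static file settings.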
This project is licensed under the MIT License.
- Django - Web framework
- Django REST Framework - API toolkit
- GROQ - LLM inference platform
- Phidata - AI agent framework
- PyMuPDF - PDF processing library
Contributions are welcome! Please feel free to submit a Pull Request.
For issues and questions, please open an issue on the GitHub repository.
Built with ❤️ by Akshay Panchivala