# Deep Research Backend

A powerful FastAPI backend that provides intelligent deep research capabilities using multiple AI models (OpenAI, Anthropic, Kimi K2). Built with LangGraph for complex research workflows and designed for deployment on GCP Cloud Run.
📹 Full YouTube Guide: Youtube link
📝 X Post: X link
💻 Launch Full Stack Product: Github Repo
☕️ Buy me a coffee: Cafe Latte
💬 Discord: Invite link
## Features

- **Multi-Model Support**: OpenAI GPT-4o, Anthropic Claude, and Kimi K2 0905
- **Streaming Research**: Real-time progress updates with detailed step visibility
- **LangGraph Integration**: Complex research workflows with clarification, briefing, execution, and reporting
- **Cloud Ready**: Docker containerized for GCP Cloud Run deployment
- **Secure API Keys**: Environment-based configuration with user-provided keys
- **Health Monitoring**: Built-in health checks and metrics collection
## Prerequisites

- Python 3.11+
- Docker (for containerization)
- GCP account (for deployment)
## Quick Start

1. **Clone the repository**

   ```bash
   git clone https://github.com/ShenSeanChen/yt-DeepResearch-Backend.git
   cd yt-DeepResearch-Backend
   ```

2. **Install dependencies**

   ```bash
   # Create a virtual environment (you can name it .venv or venv)
   python3 -m venv venv

   # Activate the virtual environment
   source venv/bin/activate   # On macOS/Linux
   # .\venv\Scripts\activate  # On Windows (PowerShell)

   # Install dependencies
   pip install -r requirements.txt
   ```

3. **Set up environment variables**

   ```bash
   cp .env.example .env
   # Edit .env with your configuration (optional - API keys can be provided via frontend)
   ```

4. **Run the development server**

   ```bash
   uvicorn main:app --host 0.0.0.0 --port 8080 --reload
   ```

5. **Test the API**

   ```bash
   curl http://localhost:8080/health
   ```
## Project Structure

- `main.py`: FastAPI application with streaming endpoints
- `services/deep_research_service.py`: Core research logic with LangGraph integration
- `services/model_service.py`: AI model management and configuration
- `models/research_models.py`: Pydantic models for API contracts
- `open_deep_research/`: Research agent implementation
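For orientation, here is a minimal sketch of the shape of a streaming FastAPI endpoint like the one in `main.py`. The event generator and payload fields are illustrative assumptions, not the actual implementation:

```python
# Minimal sketch only - the event generator and payloads are illustrative
# assumptions, not the actual main.py implementation.
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def event_stream(query: str):
    # Yield Server-Sent Events as "data: <json>\n\n" frames.
    yield f"data: {json.dumps({'type': 'session_start', 'content': 'Starting research...'})}\n\n"
    # ...intermediate research steps would stream here...
    yield f"data: {json.dumps({'type': 'research_complete', 'content': 'Final report content'})}\n\n"

@app.post("/research/stream")
async def research_stream(body: dict):
    return StreamingResponse(event_stream(body["query"]), media_type="text/event-stream")
```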
## Research Workflow

- **Clarification**: Optional query clarification with the user
- **Research Brief**: Generate research strategy and plan
- **Research Execution**: Conduct multi-source research with real-time updates
- **Final Report**: Synthesize findings into a comprehensive report
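As a rough illustration of how these stages could be wired together with LangGraph, here is a hypothetical sketch; the node functions and state fields are placeholders, and the real graph lives in `open_deep_research/`:

```python
# Hypothetical sketch of the four-stage pipeline; the real graph lives in
# open_deep_research/ and streams intermediate events.
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class ResearchState(TypedDict):
    query: str
    brief: str
    findings: list[str]
    report: str

def clarification(state: ResearchState) -> dict:
    # Optionally refine the user's query (no-op in this sketch).
    return {}

def research_brief(state: ResearchState) -> dict:
    return {"brief": f"Research plan for: {state['query']}"}

def research_execution(state: ResearchState) -> dict:
    return {"findings": ["finding 1", "finding 2"]}

def final_report(state: ResearchState) -> dict:
    return {"report": "\n".join(state["findings"])}

graph = StateGraph(ResearchState)
graph.add_node("clarification", clarification)
graph.add_node("research_brief", research_brief)
graph.add_node("research_execution", research_execution)
graph.add_node("final_report", final_report)
graph.add_edge(START, "clarification")
graph.add_edge("clarification", "research_brief")
graph.add_edge("research_brief", "research_execution")
graph.add_edge("research_execution", "final_report")
graph.add_edge("final_report", END)

pipeline = graph.compile()
result = pipeline.invoke({"query": "Your research question"})
```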
## Supported Models

| Model | Provider | Configuration |
|---|---|---|
| `openai` | OpenAI | GPT-4o with 128k context |
| `anthropic` | Anthropic | Claude-3-5-Sonnet |
| `kimi` | Moonshot AI | K2 0905 via Anthropic API |
## Environment Variables

```bash
# Optional - API keys can be provided via frontend
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key

# For Kimi K2 (uses Anthropic API format)
ANTHROPIC_API_KEY=your_kimi_key                        # When using Kimi
ANTHROPIC_BASE_URL=https://api.moonshot.ai/anthropic   # Auto-configured for Kimi
```
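Because Moonshot exposes an Anthropic-compatible endpoint, the Kimi configuration can be exercised directly with the `anthropic` Python SDK. A minimal sketch; the model identifier below is an assumption, so check Moonshot's documentation for the exact name:

```python
# Sketch: point the Anthropic SDK at Moonshot's Anthropic-compatible
# endpoint. The model identifier is an assumption - check Moonshot's docs.
import os

from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],       # your Kimi key
    base_url="https://api.moonshot.ai/anthropic",  # ANTHROPIC_BASE_URL
)

response = client.messages.create(
    model="kimi-k2-0905-preview",  # assumed identifier
    max_tokens=1024,
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.content[0].text)
```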
## API Endpoints

### Health Check

```
GET /health
HEAD /health   # For Cloud Run health checks
```

### Research Stream

```
POST /research/stream
Content-Type: application/json
```

Request body (`model` accepts `openai`, `anthropic`, or `kimi`):

```json
{
  "query": "Your research question",
  "model": "anthropic",
  "api_key": "your_api_key"
}
```

Response (Server-Sent Events):

```
data: {"type": "session_start", "content": "Starting research..."}
data: {"type": "stage_start", "stage": "clarification", "content": "..."}
data: {"type": "research_step", "stage": "research_planning", "content": "🎯 Planning research strategy..."}
data: {"type": "research_finding", "content": "🔍 Research Finding 1: ..."}
data: {"type": "research_complete", "content": "Final report content"}
```
## Docker

```bash
docker build -t deep-research-backend .
docker run -p 8080:8080 deep-research-backend
```
## GCP Cloud Run Deployment

Prerequisites:

- gcloud CLI installed and authenticated
- Docker configured for GCP
```bash
# Build and push to Google Container Registry
gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/deep-research-backend

# Deploy to Cloud Run - best practice: optimized for long-running research
gcloud run deploy deep-research-backend \
  --image gcr.io/YOUR_PROJECT_ID/deep-research-backend \
  --platform managed \
  --region europe-west1 \
  --allow-unauthenticated \
  --memory 4Gi \
  --cpu 2 \
  --timeout 3600s \
  --max-instances 5 \
  --min-instances 1 \
  --concurrency 10
```
Set these in the GCP Console under Cloud Run > Service > Edit & Deploy New Revision > Variables:

```bash
GET_API_KEYS_FROM_CONFIG=true   # Enables user-provided API keys
```
## Frontend Integration

The backend is designed to work with the Deep Research Frontend. In your frontend `.env.local`:

```bash
# For local development
NEXT_PUBLIC_BACKEND_URL=http://localhost:8080

# For production
NEXT_PUBLIC_BACKEND_URL=https://your-backend-url.run.app
```
## Testing

```bash
# Health check
curl -X GET http://localhost:8080/health

# Research stream (replace with your API key)
curl -X POST http://localhost:8080/research/stream \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the latest developments in AI?",
    "model": "anthropic",
    "api_key": "your_api_key"
  }'

# Test Kimi K2 integration
python test_kimi_model.py
```
## Monitoring

- Health checks available at `/health`
- Structured logging for debugging
- Request/response metrics collection
- Error tracking and reporting
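Cloud Run's health probes send HEAD as well as GET requests (see Troubleshooting below), so the health route has to answer both. A minimal FastAPI sketch of such an endpoint:

```python
# Sketch: one route that answers both GET and HEAD, so Cloud Run's
# HEAD probes get a 200 instead of a 405.
from fastapi import FastAPI

app = FastAPI()

@app.api_route("/health", methods=["GET", "HEAD"])
async def health():
    return {"status": "ok"}
```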
## Security

- API keys never stored server-side
- CORS configured for frontend origins
- Input validation with Pydantic models
- Rate limiting and timeout protection
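For example, the input-validation contract for `/research/stream` might look like the following Pydantic sketch. Field names are taken from the request example above; the real definitions live in `models/research_models.py`:

```python
# Sketch of the request contract; field names come from the
# /research/stream example, the real models live in models/research_models.py.
from typing import Literal

from pydantic import BaseModel, Field

class ResearchRequest(BaseModel):
    query: str = Field(min_length=1, description="The research question")
    model: Literal["openai", "anthropic", "kimi"] = "anthropic"
    api_key: str = Field(min_length=1, description="User-provided key, never stored server-side")
```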
## Troubleshooting

1. **HTTP 405 Errors in GCP Logs**
   - Fixed: HEAD method support added for health checks
2. **Token Limit Errors**
   - Solution: Upgraded to GPT-4o (128k context) for OpenAI
3. **Kimi K2 Connection Issues**
   - Check the base URL: `https://api.moonshot.ai/anthropic`
   - Verify the API key format matches Anthropic's
4. **Streaming Interruptions**
   - Keep-alive headers configured
   - Timeout set to 300s for long research
## License

MIT License - see the LICENSE file for details.
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## Support

- Create an issue for bugs or feature requests
- Check existing issues for solutions
- Review logs for debugging information
Built with ❤️ using FastAPI, LangGraph, and modern AI models.