Create, version, and test LLM prompts visually — like Postman, but for prompts.
Prompt engineering is becoming a critical skill as LLMs become more prevalent. However, the tooling around prompt development is still in its infancy. PromptForge aims to fill this gap by providing:
- Visual Development: Build complex prompts without writing code
- Systematic Testing: Test prompts methodically across providers
- Performance Tracking: Measure and optimize prompt performance
- Version Control: Track prompt evolution over time
- Production Ready: Use prompts in your applications with confidence
Whether you're building AI applications, conducting research, or just experimenting with LLMs, PromptForge provides the tools you need to engineer better prompts.
PromptForge was created to solve the challenges developers and AI engineers face when working with Large Language Models (LLMs). Traditional prompt engineering involves:
- Manual testing - Copy-pasting prompts between different environments
- No version control - Difficult to track prompt iterations and improvements
- Limited testing - Hard to compare outputs across different LLM providers
- No metrics - No systematic way to measure prompt quality, latency, or cost
- Complex pipelines - Building multi-step prompt workflows is cumbersome
PromptForge aims to be the Postman for LLM prompts - providing a visual, intuitive interface for:
- 🎨 Visual Prompt Building - Drag-and-drop interface to create complex prompt pipelines
- 📊 Comprehensive Testing - Test prompts across multiple LLM providers (OpenAI, Anthropic, Mistral, Google Gemini)
- 📈 Performance Metrics - Track latency, cost, quality, and similarity scores
- 🔄 Version Control - Track prompt iterations and compare different versions
- 🚀 Production Ready - Save and reuse prompts for your applications
PromptForge follows a microservices architecture with clear separation of concerns:
┌──────────────────────────────────────────────────────────────┐
│                      Frontend (Next.js)                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │  Dashboard   │  │   Prompts    │  │  Test Runs   │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
│                                                              │
│  ┌──────────────────────────────────────────────────┐        │
│  │        Visual Prompt Builder (React Flow)        │        │
│  └──────────────────────────────────────────────────┘        │
└──────────────────────┬───────────────────────────────────────┘
                       │ HTTP/REST API
┌──────────────────────┴───────────────────────────────────────┐
│                  Node.js Backend (Express)                   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │ Prompt CRUD  │  │ Test Run API │  │ Auth/Session │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
│                                                              │
│  ┌──────────────────────────────────────────────────┐        │
│  │           Queue Service (Celery Tasks)           │        │
│  └──────────────────────────────────────────────────┘        │
└──────────────────────┬───────────────────────────────────────┘
                       │ HTTP API
┌──────────────────────┴───────────────────────────────────────┐
│              Python Backend (FastAPI + Celery)               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │ LLM Providers│  │   Scoring    │  │    Celery    │        │
│  │ (OpenAI,     │  │    Engine    │  │    Worker    │        │
│  │  Anthropic,  │  │              │  │              │        │
│  │  Mistral,    │  │              │  │              │        │
│  │  Gemini)     │  │              │  │              │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
└──────────────────────┬───────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │                             │
┌───────┴────────┐           ┌────────┴────────┐
│   PostgreSQL   │           │      Redis      │
│   (Database)   │           │   (Job Queue)   │
└────────────────┘           └─────────────────┘
PromptForge/
├── README.md # This file - comprehensive project documentation
├── docker-compose.yml # Docker Compose configuration for all services
├── Makefile # Development automation commands
├── package.json # Root package.json for workspace management
├── CONTRIBUTING.md # Contribution guidelines
├── DUMMY_PROMPTS.md # Sample prompts for testing
└── TEST_RUN_FLOW.md # Documentation of test run execution flow
Why these files exist:
- docker-compose.yml: Orchestrates all microservices (frontend, backends, database, Redis) in a single command. Enables consistent development environments across different machines.
- Makefile: Provides convenient shortcuts for common development tasks (docker commands, database setup, etc.). Makes onboarding easier.
- DUMMY_PROMPTS.md: Contains example prompts that developers can use to test the platform without creating their own from scratch.
The frontend is a Next.js 14 application using the App Router pattern.
frontend/
├── app/ # Next.js App Router pages (file-based routing)
│ ├── page.tsx # Landing page - welcomes users and shows features
│ ├── layout.tsx # Root layout with sidebar and session provider
│ ├── globals.css # Global styles and Tailwind CSS configuration
│ ├── middleware.ts # Route protection - redirects unauthenticated users
│ ├── api/ # API routes
│ │ └── auth/
│ │ └── [...nextauth]/ # NextAuth.js API route handler
│ │ └── route.ts # Handles OAuth callbacks and session management
│ ├── auth/
│ │ └── signin/
│ │ └── page.tsx # Sign-in page with OAuth providers (Google, GitHub)
│ ├── dashboard/
│ │ └── page.tsx # Dashboard with stats, charts, and analytics
│ ├── prompts/
│ │ ├── page.tsx # Prompts list page - shows all user prompts
│ │ ├── [id]/
│ │ │ └── page.tsx # Individual prompt detail/edit page
│ │ └── builder/
│ │ └── page.tsx # Visual prompt builder page (React Flow)
│ └── tests/
│ └── page.tsx # Test runs list page - shows all test executions
├── components/ # Reusable React components
│ ├── layout/
│ │ ├── Sidebar.tsx # Main navigation sidebar (replaces navbar)
│ │ ├── ConditionalSidebar.tsx # Conditionally renders sidebar based on route
│ │ └── Navbar.tsx # Legacy navbar (kept for reference)
│ ├── prompt-builder/
│ │ └── PromptBuilder.tsx # React Flow visual prompt builder component
│ ├── prompt-create/
│ │ └── CreatePromptDialog.tsx # Dialog for creating new prompts
│ ├── prompt-list/
│ │ └── PromptList.tsx # List view of all prompts with search/filter
│ ├── test-run/
│ │ ├── TestRunDialog.tsx # Dialog for creating test runs
│ │ └── TestRunDetails.tsx # Detailed view of test run results
│ ├── dashboard/
│ │ ├── Charts.tsx # Recharts components for dashboard visualizations
│ │ └── StatsCard.tsx # Reusable stat card component
│ ├── providers/
│ │ └── SessionProvider.tsx # Client-side wrapper for NextAuth SessionProvider
│ └── ui/ # Shadcn UI components (button, dialog, input, etc.)
├── lib/
│ ├── api.ts # API client - axios instance and API methods
│ ├── auth.ts # NextAuth configuration and helpers
│ └── utils.ts # Utility functions (cn helper for Tailwind)
├── types/
│ └── next-auth.d.ts # TypeScript type extensions for NextAuth
├── package.json # Frontend dependencies
├── next.config.js # Next.js configuration (transpiles recharts)
├── tailwind.config.ts # Tailwind CSS configuration
└── tsconfig.json # TypeScript configuration
Why this structure:
- App Router: Next.js 13+ App Router provides better performance, server components, and file-based routing.
- Component organization: Separated by feature (layout, prompt-builder, test-run) for better maintainability.
- lib/: Centralized API client and utilities reduce code duplication (see the sketch below).
- types/: TypeScript definitions ensure type safety across the application.
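As an illustration of the centralized client, lib/api.ts presumably wraps a single axios instance along these lines (a minimal sketch; the endpoint paths and helper names are assumptions, not the repo's actual routes):

```typescript
// lib/api.ts (sketch) - endpoint paths are illustrative, not the repo's actual routes.
import axios from "axios";

// One shared instance so the base URL and headers are configured in a single place.
export const api = axios.create({
  baseURL: process.env.NEXT_PUBLIC_API_URL ?? "http://localhost:4000",
  headers: { "Content-Type": "application/json" },
});

// Hypothetical helper methods built on the shared instance.
export const listPrompts = () => api.get("/prompts").then((res) => res.data);
export const createTestRun = (promptId: string, input: Record<string, string>) =>
  api.post("/test-runs", { promptId, input }).then((res) => res.data);
```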
The Node.js backend serves as the API gateway and handles business logic.
backend-node/
├── src/
│ ├── index.ts # Express server entry point
│ │ # - Sets up Express app, CORS, middleware
│ │ # - Defines API routes (prompts, test-runs)
│ │ # - Handles prompt CRUD operations
│ │ # - Creates test runs and queues them
│ └── services/
│ └── queueService.ts # Celery task queue service
│ # - Sends test run tasks to Python backend
│ # - Handles async job orchestration
├── prisma/
│ └── schema.prisma # Prisma ORM schema definition
│ # - User, Prompt, TestRun, PromptVersion models
│ # - Database relationships and indexes
├── Dockerfile # Production Docker image
├── Dockerfile.dev # Development Docker image (with hot reload)
├── package.json # Node.js dependencies
└── tsconfig.json # TypeScript configuration
Why Node.js backend:
- Type safety: TypeScript ensures type safety between frontend and backend.
- Prisma ORM: Provides type-safe database access and migrations.
- Express: Fast, minimal, and well-suited for REST APIs.
- Separation of concerns: Node.js handles data persistence, Python handles LLM execution.
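For illustration, the queue handoff in queueService.ts could look roughly like this sketch using the celery-node package (the task name, argument shape, and broker URL below are assumptions; the actual service may enqueue work differently):

```typescript
// src/services/queueService.ts (sketch) - enqueues a Celery task from Node
// via the celery-node package. The task name and argument shape must match
// whatever the Python worker registers; both are assumptions here.
import * as celery from "celery-node";

const broker = process.env.REDIS_URL ?? "redis://localhost:6379/0";
// celery-node takes a broker URL and a result-backend URL.
const client = celery.createClient(broker, broker);

export function enqueueTestRun(testRunId: string): void {
  // Name assumed to match the task defined in app/celery_app.py.
  const task = client.createTask("app.celery_app.execute_prompt_task");
  task.applyAsync([testRunId]);
}
```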
The Python backend handles LLM execution and evaluation.
backend-python/
├── app/
│ ├── __init__.py # Package initialization
│ ├── celery_app.py # Celery application configuration
│ │ # - Defines async task: execute_prompt_task
│ │ # - Handles test run execution workflow
│ ├── llm_providers.py # LLM provider implementations
│ │ # - OpenAIProvider: OpenAI API integration
│ │ # - AnthropicProvider: Claude API integration
│ │ # - MistralProvider: Mistral AI integration
│ │ # - GeminiProvider: Google Gemini integration
│ │ # - Prompt template variable substitution
│ └── scoring_engine.py # Evaluation metrics computation
│ # - Similarity scoring (semantic similarity)
│ # - Latency measurement
│ # - Cost calculation
├── main.py # FastAPI application entry point
│ # - Health check endpoint
│ # - API documentation
├── requirements.txt # Python dependencies
├── Dockerfile # Production Docker image
└── Dockerfile.dev # Development Docker image
Why Python backend:
- LLM libraries: Python has the best ecosystem for LLM integrations (OpenAI, Anthropic, etc.).
- Celery: Industry-standard async task queue for Python.
- FastAPI: Modern, fast API framework with automatic OpenAPI documentation.
- ML libraries: Easy integration with ML libraries for scoring (sentence-transformers, scikit-learn).
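To make the scoring idea concrete: semantic similarity typically reduces to cosine similarity between embedding vectors. A minimal sketch, shown here in TypeScript even though the real engine runs in Python over model-generated embeddings:

```typescript
// Cosine similarity between two embedding vectors - the core of semantic
// similarity scoring. In PromptForge the vectors would come from an
// embedding model on the Python side; here they are plain number arrays.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1.0; orthogonal vectors score 0.0.
console.log(cosineSimilarity([1, 0, 1], [1, 0, 1])); // 1
```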
shared/
└── types/
└── index.ts # Shared TypeScript types
# - Prompt interface
# - PromptContent interface
# - TestRun interface
# - Ensures type consistency across frontend/backend
Why shared types:
- Type safety: Ensures frontend and backend use the same data structures.
- Single source of truth: Changes to types are reflected everywhere.
- Reduces bugs: TypeScript catches mismatches at compile time.
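For a sense of shape, the shared definitions might look roughly like this (field names are illustrative assumptions, not the repo's exact interfaces):

```typescript
// shared/types/index.ts (sketch) - field names are assumptions, not the
// repo's exact definitions.
export interface PromptContent {
  template: string;    // e.g. "Summarize: {{text}}"
  variables: string[]; // e.g. ["text"]
}

export interface Prompt {
  id: string;
  name: string;
  content: PromptContent;
  createdAt: string;
}

export interface TestRun {
  id: string;
  promptId: string;
  provider: "openai" | "anthropic" | "mistral" | "gemini";
  status: "pending" | "running" | "completed" | "failed";
  latencyMs?: number;
  cost?: number;
  similarityScore?: number;
}
```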
Root configuration files:

├── .gitignore # Git ignore patterns (node_modules, .env, etc.)
├── docker-compose.yml # Service orchestration
└── Makefile # Development automation
Start everything with a single command:

make docker-up

Or manually:

docker-compose up -d

This will start:
- Frontend (http://localhost:3000)
- Node.js Backend API (http://localhost:4000)
- Python Backend API (http://localhost:8000)
- Celery Worker (background tasks)
- PostgreSQL (database) - Port 5433
- Redis (job queue)
make docker-down      # Stop all services
make docker-logs      # View logs
make docker-build     # Rebuild images
make docker-restart   # Restart services

After starting services for the first time:

make setup-db

This will:
- Generate Prisma client
- Run database migrations
- Create all necessary tables and enums
Create a .env file in the root directory (or set environment variables):
# Required
NEXTAUTH_SECRET=your-secret-key-here
OPENAI_API_KEY=sk-... # Or at least one LLM provider key
# Optional - LLM Provider Keys
ANTHROPIC_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=
# Optional - OAuth (for user authentication)
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
# Database (usually handled by docker-compose)
DATABASE_URL=postgresql://promptforge:promptforge@postgres:5432/promptforge

Frontend specific (frontend/.env.local):
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=your-secret-key-here
NEXT_PUBLIC_API_URL=http://localhost:4000
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret

The docker-compose.yml will automatically use these environment variables.
# Start all services
make docker-up
# View logs
make docker-logs
# Restart a specific service
docker-compose restart frontend
# Stop all services
make docker-down

If you prefer to run services locally:
# Install dependencies
make install
# Start services manually (4 terminals):
# Terminal 1: make dev-frontend
# Terminal 2: make dev-backend
# Terminal 3: cd backend-python && source venv/bin/activate && uvicorn main:app --reload
# Terminal 4: cd backend-python && source venv/bin/activate && celery -A app.celery_app worker --loglevel=info

See all available commands:

make help

Key features:

Prompt management:
- Create, read, update, delete prompts
- Version control for prompt iterations
- Search and filter prompts
- Template variable support ({{variable}}) - see the sketch below
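A minimal sketch of how {{variable}} substitution works conceptually (the actual implementation lives in the Python backend's provider layer):

```typescript
// Replace {{variable}} placeholders with values from an input object.
// Unknown placeholders are left untouched. A sketch of the idea only.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => vars[name] ?? match);
}

// renderTemplate("Summarize: {{text}}", { text: "LLMs are..." })
// -> "Summarize: LLMs are..."
```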
Multi-provider testing:
- Test prompts across multiple LLM providers
- Compare outputs from different models
- Async execution via Celery workers
- Real-time status updates
Performance metrics:
- Latency: Response time measurement
- Cost: Token usage and cost calculation (sketched after this list)
- Quality: Semantic similarity scoring
- Analytics: Dashboard with charts and statistics
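To make the cost metric concrete: cost is essentially token counts multiplied by per-token prices. A sketch with placeholder prices (real per-model rates change; check the providers' pricing pages):

```typescript
// Illustrative placeholder prices per 1K tokens in USD - not real rates.
const PRICES: Record<string, { input: number; output: number }> = {
  "example-model": { input: 0.0025, output: 0.01 },
};

function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (!price) throw new Error(`No pricing entry for model: ${model}`);
  return (inputTokens / 1000) * price.input + (outputTokens / 1000) * price.output;
}
```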
Authentication:
- OAuth integration (Google, GitHub)
- Secure session management
- Protected routes
- User-specific prompts and test runs
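The OAuth setup in lib/auth.ts presumably resembles this minimal NextAuth sketch (callbacks, adapters, and session options in the actual repo may differ):

```typescript
// lib/auth.ts (sketch) - minimal NextAuth configuration with both OAuth providers.
import type { NextAuthOptions } from "next-auth";
import GoogleProvider from "next-auth/providers/google";
import GitHubProvider from "next-auth/providers/github";

export const authOptions: NextAuthOptions = {
  secret: process.env.NEXTAUTH_SECRET,
  providers: [
    GoogleProvider({
      clientId: process.env.GOOGLE_CLIENT_ID!,
      clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
    }),
    GitHubProvider({
      clientId: process.env.GITHUB_CLIENT_ID!,
      clientSecret: process.env.GITHUB_CLIENT_SECRET!,
    }),
  ],
};
```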
If services fail to start, check the logs and rebuild:

# Check logs
make docker-logs

# Rebuild images
make docker-build

# Restart services
make docker-restart

To reset the database:

# Reset database
docker-compose down -v
docker-compose up -d postgres redis
make setup-db

If ports 3000, 4000, 8000, 5433, or 6379 are in use, either:
- Stop the conflicting services
- Update port mappings in docker-compose.yml
If the frontend can't reach the API:

- Ensure NEXT_PUBLIC_API_URL is set correctly
- Check that backend-node is running on port 4000
- Verify CORS is configured in backend-node/src/index.ts
If test runs don't execute:

- Check Celery worker logs: docker-compose logs celery-worker
- Verify Redis is running: docker-compose ps redis
- Ensure LLM API keys are set in environment variables
Use the examples in DUMMY_PROMPTS.md to create test prompts.
1. Create a prompt in the UI
2. Go to the Test Runs page
3. Click the "Test Run" button
4. Select a prompt and provide test input (JSON format - example below)
5. View results with metrics
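For step 4, the test input is a JSON object whose keys match the prompt's template variables. For example, for a prompt that uses a hypothetical {{text}} variable:

```json
{
  "text": "Paste the content the prompt should operate on here."
}
```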
- Frontend README - Frontend-specific documentation
- Node.js Backend README - Backend API documentation
- Python Backend README - LLM execution engine docs
- Contributing Guidelines - How to contribute
- Dummy Prompts - Example prompts for testing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License.
Sushant R. Dangal