Create, version, and test LLM prompts visually — like Postman, but for prompts.
Prompt engineering is becoming a critical skill as LLMs become more prevalent. However, the tooling around prompt development is still in its infancy. PromptForge aims to fill this gap by providing:
- Visual Development: Build complex prompts without writing code
- Systematic Testing: Test prompts methodically across providers
- Performance Tracking: Measure and optimize prompt performance
- Version Control: Track prompt evolution over time
- Production Ready: Use prompts in your applications with confidence
Whether you're building AI applications, conducting research, or just experimenting with LLMs, PromptForge provides the tools you need to engineer better prompts.
PromptForge was created to solve the challenges developers and AI engineers face when working with Large Language Models (LLMs). Traditional prompt engineering involves:
- Manual testing - Copy-pasting prompts between different environments
- No version control - Difficult to track prompt iterations and improvements
- Limited testing - Hard to compare outputs across different LLM providers
- No metrics - No systematic way to measure prompt quality, latency, or cost
- Complex pipelines - Building multi-step prompt workflows is cumbersome
PromptForge aims to be the Postman for LLM prompts - providing a visual, intuitive interface for:
- 🎨 Visual Prompt Building - Drag-and-drop interface to create complex prompt pipelines
- 📊 Comprehensive Testing - Test prompts across multiple LLM providers (OpenAI, Anthropic, Mistral, Google Gemini)
- 📈 Performance Metrics - Track latency, cost, quality, and similarity scores
- 🔄 Version Control - Track prompt iterations and compare different versions
- 🚀 Production Ready - Save and reuse prompts for your applications
PromptForge follows a microservices architecture with clear separation of concerns:
┌──────────────────────────────────────────────────────────────┐
│                      Frontend (Next.js)                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │  Dashboard   │  │   Prompts    │  │  Test Runs   │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
│                                                              │
│  ┌──────────────────────────────────────────────────┐        │
│  │        Visual Prompt Builder (React Flow)        │        │
│  └──────────────────────────────────────────────────┘        │
└──────────────────────┬───────────────────────────────────────┘
                       │ HTTP/REST API
┌──────────────────────┴───────────────────────────────────────┐
│                  Node.js Backend (Express)                   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │ Prompt CRUD  │  │ Test Run API │  │ Auth/Session │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
│                                                              │
│  ┌──────────────────────────────────────────────────┐        │
│  │           Queue Service (Celery Tasks)           │        │
│  └──────────────────────────────────────────────────┘        │
└──────────────────────┬───────────────────────────────────────┘
                       │ HTTP API
┌──────────────────────┴───────────────────────────────────────┐
│              Python Backend (FastAPI + Celery)               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │ LLM Providers│  │   Scoring    │  │    Celery    │        │
│  │ (OpenAI,     │  │    Engine    │  │    Worker    │        │
│  │  Anthropic,  │  │              │  │              │        │
│  │  Mistral,    │  │              │  │              │        │
│  │  Gemini)     │  │              │  │              │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
└──────────────────────┬───────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │                             │
┌───────┴────────┐           ┌────────┴────────┐
│   PostgreSQL   │           │      Redis      │
│   (Database)   │           │   (Job Queue)   │
└────────────────┘           └─────────────────┘
PromptForge/
├── README.md # This file - comprehensive project documentation
├── docker-compose.yml # Docker Compose configuration for all services
├── Makefile # Development automation commands
├── package.json # Root package.json for workspace management
├── CONTRIBUTING.md # Contribution guidelines
├── DUMMY_PROMPTS.md # Sample prompts for testing
└── TEST_RUN_FLOW.md # Documentation of test run execution flow
Why these files exist:
- docker-compose.yml: Orchestrates all microservices (frontend, backends, database, Redis) in a single command. Enables consistent development environments across different machines.
- Makefile: Provides convenient shortcuts for common development tasks (docker commands, database setup, etc.). Makes onboarding easier.
- DUMMY_PROMPTS.md: Contains example prompts that developers can use to test the platform without creating their own from scratch.
The frontend is a Next.js 14 application using the App Router pattern.
frontend/
├── app/ # Next.js App Router pages (file-based routing)
│ ├── page.tsx # Landing page - welcomes users and shows features
│ ├── layout.tsx # Root layout with sidebar and session provider
│ ├── globals.css # Global styles and Tailwind CSS configuration
│ ├── middleware.ts # Route protection - redirects unauthenticated users
│ ├── api/ # API routes
│ │ └── auth/
│ │ └── [...nextauth]/ # NextAuth.js API route handler
│ │ └── route.ts # Handles OAuth callbacks and session management
│ ├── auth/
│ │ └── signin/
│ │ └── page.tsx # Sign-in page with OAuth providers (Google, GitHub)
│ ├── dashboard/
│ │ └── page.tsx # Dashboard with stats, charts, and analytics
│ ├── prompts/
│ │ ├── page.tsx # Prompts list page - shows all user prompts
│ │ ├── [id]/
│ │ │ └── page.tsx # Individual prompt detail/edit page
│ │ └── builder/
│ │ └── page.tsx # Visual prompt builder page (React Flow)
│ └── tests/
│ └── page.tsx # Test runs list page - shows all test executions
├── components/ # Reusable React components
│ ├── layout/
│ │ ├── Sidebar.tsx # Main navigation sidebar (replaces navbar)
│ │ ├── ConditionalSidebar.tsx # Conditionally renders sidebar based on route
│ │ └── Navbar.tsx # Legacy navbar (kept for reference)
│ ├── prompt-builder/
│ │ └── PromptBuilder.tsx # React Flow visual prompt builder component
│ ├── prompt-create/
│ │ └── CreatePromptDialog.tsx # Dialog for creating new prompts
│ ├── prompt-list/
│ │ └── PromptList.tsx # List view of all prompts with search/filter
│ ├── test-run/
│ │ ├── TestRunDialog.tsx # Dialog for creating test runs
│ │ └── TestRunDetails.tsx # Detailed view of test run results
│ ├── dashboard/
│ │ ├── Charts.tsx # Recharts components for dashboard visualizations
│ │ └── StatsCard.tsx # Reusable stat card component
│ ├── providers/
│ │ └── SessionProvider.tsx # Client-side wrapper for NextAuth SessionProvider
│ └── ui/ # Shadcn UI components (button, dialog, input, etc.)
├── lib/
│ ├── api.ts # API client - axios instance and API methods
│ ├── auth.ts # NextAuth configuration and helpers
│ └── utils.ts # Utility functions (cn helper for Tailwind)
├── types/
│ └── next-auth.d.ts # TypeScript type extensions for NextAuth
├── package.json # Frontend dependencies
├── next.config.js # Next.js configuration (transpiles recharts)
├── tailwind.config.ts # Tailwind CSS configuration
└── tsconfig.json # TypeScript configuration
Why this structure:
- App Router: Next.js 13+ App Router provides better performance, server components, and file-based routing.
- Component organization: Separated by feature (layout, prompt-builder, test-run) for better maintainability.
- lib/: Centralized API client and utilities reduce code duplication (see the sketch below).
- types/: TypeScript definitions ensure type safety across the application.
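As an illustration of the centralized client, lib/api.ts presumably wraps a single axios instance along these lines (a minimal sketch; the endpoint paths and helper names are assumptions, not the repo's actual routes):

```typescript
// lib/api.ts (sketch) - endpoint paths are illustrative, not the repo's actual routes.
import axios from "axios";

// One shared instance so the base URL and headers are configured in a single place.
export const api = axios.create({
  baseURL: process.env.NEXT_PUBLIC_API_URL ?? "http://localhost:4000",
  headers: { "Content-Type": "application/json" },
});

// Hypothetical helper methods built on the shared instance.
export const listPrompts = () => api.get("/prompts").then((res) => res.data);
export const createTestRun = (promptId: string, input: Record<string, string>) =>
  api.post("/test-runs", { promptId, input }).then((res) => res.data);
```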
The Node.js backend serves as the API gateway and handles business logic.
backend-node/
├── src/
│ ├── index.ts # Express server entry point
│ │ # - Sets up Express app, CORS, middleware
│ │ # - Defines API routes (prompts, test-runs)
│ │ # - Handles prompt CRUD operations
│ │ # - Creates test runs and queues them
│ └── services/
│ └── queueService.ts # Celery task queue service
│ # - Sends test run tasks to Python backend
│ # - Handles async job orchestration
├── prisma/
│ └── schema.prisma # Prisma ORM schema definition
│ # - User, Prompt, TestRun, PromptVersion models
│ # - Database relationships and indexes
├── Dockerfile # Production Docker image
├── Dockerfile.dev # Development Docker image (with hot reload)
├── package.json # Node.js dependencies
└── tsconfig.json # TypeScript configuration
Why Node.js backend:
- Type safety: TypeScript ensures type safety between frontend and backend.
- Prisma ORM: Provides type-safe database access and migrations.
- Express: Fast, minimal, and well-suited for REST APIs.
- Separation of concerns: Node.js handles data persistence, Python handles LLM execution.
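For illustration, the queue handoff in queueService.ts could look roughly like this sketch using the celery-node package (the task name, argument shape, and broker URL below are assumptions; the actual service may enqueue work differently):

```typescript
// src/services/queueService.ts (sketch) - enqueues a Celery task from Node
// via the celery-node package. The task name and argument shape must match
// whatever the Python worker registers; both are assumptions here.
import * as celery from "celery-node";

const broker = process.env.REDIS_URL ?? "redis://localhost:6379/0";
// celery-node takes a broker URL and a result-backend URL.
const client = celery.createClient(broker, broker);

export function enqueueTestRun(testRunId: string): void {
  // Name assumed to match the task defined in app/celery_app.py.
  const task = client.createTask("app.celery_app.execute_prompt_task");
  task.applyAsync([testRunId]);
}
```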
The Python backend handles LLM execution and evaluation.
backend-python/
├── app/
│ ├── __init__.py # Package initialization
│ ├── celery_app.py # Celery application configuration
│ │ # - Defines async task: execute_prompt_task
│ │ # - Handles test run execution workflow
│ ├── llm_providers.py # LLM provider implementations
│ │ # - OpenAIProvider: OpenAI API integration
│ │ # - AnthropicProvider: Claude API integration
│ │ # - MistralProvider: Mistral AI integration
│ │ # - GeminiProvider: Google Gemini integration
│ │ # - Prompt template variable substitution
│ └── scoring_engine.py # Evaluation metrics computation
│ # - Similarity scoring (semantic similarity)
│ # - Latency measurement
│ # - Cost calculation
├── main.py # FastAPI application entry point
│ # - Health check endpoint
│ # - API documentation
├── requirements.txt # Python dependencies
├── Dockerfile # Production Docker image
└── Dockerfile.dev # Development Docker image
Why Python backend:
- LLM libraries: Python has the best ecosystem for LLM integrations (OpenAI, Anthropic, etc.).
- Celery: Industry-standard async task queue for Python.
- FastAPI: Modern, fast API framework with automatic OpenAPI documentation.
- ML libraries: Easy integration with ML libraries for scoring (sentence-transformers, scikit-learn).
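To make the scoring idea concrete: semantic similarity typically reduces to cosine similarity between embedding vectors. A minimal sketch, shown here in TypeScript even though the real engine runs in Python over model-generated embeddings:

```typescript
// Cosine similarity between two embedding vectors - the core of semantic
// similarity scoring. In PromptForge the vectors would come from an
// embedding model on the Python side; here they are plain number arrays.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1.0; orthogonal vectors score 0.0.
console.log(cosineSimilarity([1, 0, 1], [1, 0, 1])); // 1
```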
shared/
└── types/
└── index.ts # Shared TypeScript types
# - Prompt interface
# - PromptContent interface
# - TestRun interface
# - Ensures type consistency across frontend/backend
Why shared types:
- Type safety: Ensures frontend and backend use the same data structures.
- Single source of truth: Changes to types are reflected everywhere.
- Reduces bugs: TypeScript catches mismatches at compile time.
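For a sense of shape, the shared definitions might look roughly like this (field names are illustrative assumptions, not the repo's exact interfaces):

```typescript
// shared/types/index.ts (sketch) - field names are assumptions, not the
// repo's exact definitions.
export interface PromptContent {
  template: string;    // e.g. "Summarize: {{text}}"
  variables: string[]; // e.g. ["text"]
}

export interface Prompt {
  id: string;
  name: string;
  content: PromptContent;
  createdAt: string;
}

export interface TestRun {
  id: string;
  promptId: string;
  provider: "openai" | "anthropic" | "mistral" | "gemini";
  status: "pending" | "running" | "completed" | "failed";
  latencyMs?: number;
  cost?: number;
  similarityScore?: number;
}
```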
Root configuration files:

├── .gitignore # Git ignore patterns (node_modules, .env, etc.)
├── docker-compose.yml # Service orchestration
└── Makefile # Development automation
Start everything with a single command:

make docker-up

Or manually:

docker-compose up -d

This will start:
- Frontend (http://localhost:3000)
- Node.js Backend API (http://localhost:4000)
- Python Backend API (http://localhost:8000)
- Celery Worker (background tasks)
- PostgreSQL (database) - Port 5433
- Redis (job queue)
make docker-down      # Stop all services
make docker-logs      # View logs
make docker-build     # Rebuild images
make docker-restart   # Restart services

After starting services for the first time:

make setup-db

This will:
- Generate Prisma client
- Run database migrations
- Create all necessary tables and enums
Create a .env file in the root directory (or set environment variables):
# Required
NEXTAUTH_SECRET=your-secret-key-here
OPENAI_API_KEY=sk-... # Or at least one LLM provider key
# Optional - LLM Provider Keys
ANTHROPIC_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=
# Optional - OAuth (for user authentication)
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
# Database (usually handled by docker-compose)
DATABASE_URL=postgresql://promptforge:promptforge@postgres:5432/promptforge

Frontend specific (frontend/.env.local):
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=your-secret-key-here
NEXT_PUBLIC_API_URL=http://localhost:4000
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret

The docker-compose.yml will automatically use these environment variables.
# Start all services
make docker-up
# View logs
make docker-logs
# Restart a specific service
docker-compose restart frontend
# Stop all services
make docker-down

If you prefer to run services locally:
# Install dependencies
make install
# Start services manually (4 terminals):
# Terminal 1: make dev-frontend
# Terminal 2: make dev-backend
# Terminal 3: cd backend-python && source venv/bin/activate && uvicorn main:app --reload
# Terminal 4: cd backend-python && source venv/bin/activate && celery -A app.celery_app worker --loglevel=info

See all available commands:

make help

Key features:

Prompt management:
- Create, read, update, delete prompts
- Version control for prompt iterations
- Search and filter prompts
- Template variable support ({{variable}}) - see the sketch below
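A minimal sketch of how {{variable}} substitution works conceptually (the actual implementation lives in the Python backend's provider layer):

```typescript
// Replace {{variable}} placeholders with values from an input object.
// Unknown placeholders are left untouched. A sketch of the idea only.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => vars[name] ?? match);
}

// renderTemplate("Summarize: {{text}}", { text: "LLMs are..." })
// -> "Summarize: LLMs are..."
```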
Multi-provider testing:
- Test prompts across multiple LLM providers
- Compare outputs from different models
- Async execution via Celery workers
- Real-time status updates
Performance metrics:
- Latency: Response time measurement
- Cost: Token usage and cost calculation (sketched after this list)
- Quality: Semantic similarity scoring
- Analytics: Dashboard with charts and statistics
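To make the cost metric concrete: cost is essentially token counts multiplied by per-token prices. A sketch with placeholder prices (real per-model rates change; check the providers' pricing pages):

```typescript
// Illustrative placeholder prices per 1K tokens in USD - not real rates.
const PRICES: Record<string, { input: number; output: number }> = {
  "example-model": { input: 0.0025, output: 0.01 },
};

function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (!price) throw new Error(`No pricing entry for model: ${model}`);
  return (inputTokens / 1000) * price.input + (outputTokens / 1000) * price.output;
}
```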
Authentication:
- OAuth integration (Google, GitHub)
- Secure session management
- Protected routes
- User-specific prompts and test runs
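The OAuth setup in lib/auth.ts presumably resembles this minimal NextAuth sketch (callbacks, adapters, and session options in the actual repo may differ):

```typescript
// lib/auth.ts (sketch) - minimal NextAuth configuration with both OAuth providers.
import type { NextAuthOptions } from "next-auth";
import GoogleProvider from "next-auth/providers/google";
import GitHubProvider from "next-auth/providers/github";

export const authOptions: NextAuthOptions = {
  secret: process.env.NEXTAUTH_SECRET,
  providers: [
    GoogleProvider({
      clientId: process.env.GOOGLE_CLIENT_ID!,
      clientSecret: process.env.GOOGLE_CLIENT_SECRET!,
    }),
    GitHubProvider({
      clientId: process.env.GITHUB_CLIENT_ID!,
      clientSecret: process.env.GITHUB_CLIENT_SECRET!,
    }),
  ],
};
```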
If services fail to start, check the logs and rebuild:

# Check logs
make docker-logs

# Rebuild images
make docker-build

# Restart services
make docker-restart

To reset the database:

# Reset database
docker-compose down -v
docker-compose up -d postgres redis
make setup-db

If ports 3000, 4000, 8000, 5433, or 6379 are in use, either:
- Stop the conflicting services
- Update port mappings in docker-compose.yml
If the frontend can't reach the API:

- Ensure NEXT_PUBLIC_API_URL is set correctly
- Check that backend-node is running on port 4000
- Verify CORS is configured in backend-node/src/index.ts
If test runs don't execute:

- Check Celery worker logs: docker-compose logs celery-worker
- Verify Redis is running: docker-compose ps redis
- Ensure LLM API keys are set in environment variables
Use the examples in DUMMY_PROMPTS.md to create test prompts.
1. Create a prompt in the UI
2. Go to the Test Runs page
3. Click the "Test Run" button
4. Select a prompt and provide test input (JSON format - example below)
5. View results with metrics
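For step 4, the test input is a JSON object whose keys match the prompt's template variables. For example, for a prompt that uses a hypothetical {{text}} variable:

```json
{
  "text": "Paste the content the prompt should operate on here."
}
```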
- Frontend README - Frontend-specific documentation
- Node.js Backend README - Backend API documentation
- Python Backend README - LLM execution engine docs
- Contributing Guidelines - How to contribute
- Dummy Prompts - Example prompts for testing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License.
Sushant R. Dangal