Visual prompt engineering platform for creating, testing, and versioning LLM prompts across multiple providers (OpenAI, Anthropic, Mistral, Gemini).

PromptForge — Visual Prompt Engineering & Testing Platform

Create, version, and test LLM prompts visually — like Postman, but for prompts.

🎯 Why PromptForge?

Prompt engineering is becoming a critical skill as LLMs become more prevalent. However, the tooling around prompt development is still in its infancy. PromptForge aims to fill this gap by providing:

  1. Visual Development: Build complex prompts without writing code
  2. Systematic Testing: Test prompts methodically across providers
  3. Performance Tracking: Measure and optimize prompt performance
  4. Version Control: Track prompt evolution over time
  5. Production Ready: Use prompts in your applications with confidence

Whether you're building AI applications, conducting research, or just experimenting with LLMs, PromptForge provides the tools you need to engineer better prompts.

🎯 Purpose & Motivation

PromptForge was created to solve the challenges developers and AI engineers face when working with Large Language Models (LLMs). Traditional prompt engineering involves:

  • Manual testing - Copy-pasting prompts between different environments
  • No version control - Difficult to track prompt iterations and improvements
  • Limited testing - Hard to compare outputs across different LLM providers
  • No metrics - No systematic way to measure prompt quality, latency, or cost
  • Complex pipelines - Building multi-step prompt workflows is cumbersome

PromptForge aims to be the Postman for LLM prompts - providing a visual, intuitive interface for:

  • 🎨 Visual Prompt Building - Drag-and-drop interface to create complex prompt pipelines
  • 📊 Comprehensive Testing - Test prompts across multiple LLM providers (OpenAI, Anthropic, Mistral, Google Gemini)
  • 📈 Performance Metrics - Track latency, cost, quality, and similarity scores
  • 🔄 Version Control - Track prompt iterations and compare different versions
  • 🚀 Production Ready - Save and reuse prompts for your applications

🏗️ Architecture Overview

PromptForge follows a microservices architecture with clear separation of concerns:

┌─────────────────────────────────────────────────────────────┐
│                    Frontend (Next.js)                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │   Dashboard  │  │   Prompts    │  │  Test Runs   │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
│                                                             │
│  ┌──────────────────────────────────────────────────┐       │
│  │        Visual Prompt Builder (React Flow)        │       │
│  └──────────────────────────────────────────────────┘       │
└──────────────────────┬──────────────────────────────────────┘
                       │ HTTP/REST API
┌──────────────────────┴────────────────────────────────────┐
│            Node.js Backend (Express)                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │  Prompt CRUD │  │ Test Run API │  │ Auth/Session │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
│                                                           │
│  ┌──────────────────────────────────────────────────┐     │
│  │         Queue Service (Celery Tasks)             │     │
│  └──────────────────────────────────────────────────┘     │
└──────────────────────┬────────────────────────────────────┘
                       │ HTTP API
┌──────────────────────┴────────────────────────────────────┐
│          Python Backend (FastAPI + Celery)                │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │ LLM Providers│  │   Scoring    │  │   Celery     │     │
│  │  (OpenAI,    │  │    Engine    │  │   Worker     │     │
│  │  Anthropic,  │  │              │  │              │     │
│  │  Mistral,    │  │              │  │              │     │
│  │  Gemini)     │  │              │  │              │     │
│  └──────────────┘  └──────────────┘  └──────────────┘     │
└──────────────────────┬────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │                             │
┌───────┴────────┐           ┌────────┴────────┐
│  PostgreSQL    │           │     Redis       │
│   (Database)   │           │  (Job Queue)    │
└────────────────┘           └─────────────────┘

📁 Project Structure & File Descriptions

Root Level Files

PromptForge/
├── README.md                    # This file - comprehensive project documentation
├── docker-compose.yml           # Docker Compose configuration for all services
├── Makefile                     # Development automation commands
├── package.json                 # Root package.json for workspace management
├── CONTRIBUTING.md              # Contribution guidelines
├── DUMMY_PROMPTS.md             # Sample prompts for testing
└── TEST_RUN_FLOW.md            # Documentation of test run execution flow

Why these files exist:

  • docker-compose.yml: Orchestrates all microservices (frontend, backends, database, Redis) in a single command. Enables consistent development environments across different machines.
  • Makefile: Provides convenient shortcuts for common development tasks (docker commands, database setup, etc.). Makes onboarding easier.
  • DUMMY_PROMPTS.md: Contains example prompts that developers can use to test the platform without creating their own from scratch.

Frontend (/frontend)

The frontend is a Next.js 14 application using the App Router pattern.

frontend/
├── app/                         # Next.js App Router pages (file-based routing)
│   ├── page.tsx                 # Landing page - welcomes users and shows features
│   ├── layout.tsx               # Root layout with sidebar and session provider
│   ├── globals.css              # Global styles and Tailwind CSS configuration
│   ├── middleware.ts            # Route protection - redirects unauthenticated users
│   ├── api/                     # API routes
│   │   └── auth/
│   │       └── [...nextauth]/   # NextAuth.js API route handler
│   │           └── route.ts      # Handles OAuth callbacks and session management
│   ├── auth/
│   │   └── signin/
│   │       └── page.tsx          # Sign-in page with OAuth providers (Google, GitHub)
│   ├── dashboard/
│   │   └── page.tsx             # Dashboard with stats, charts, and analytics
│   ├── prompts/
│   │   ├── page.tsx             # Prompts list page - shows all user prompts
│   │   ├── [id]/
│   │   │   └── page.tsx         # Individual prompt detail/edit page
│   │   └── builder/
│   │       └── page.tsx          # Visual prompt builder page (React Flow)
│   └── tests/
│       └── page.tsx             # Test runs list page - shows all test executions
├── components/                   # Reusable React components
│   ├── layout/
│   │   ├── Sidebar.tsx          # Main navigation sidebar (replaces navbar)
│   │   ├── ConditionalSidebar.tsx # Conditionally renders sidebar based on route
│   │   └── Navbar.tsx           # Legacy navbar (kept for reference)
│   ├── prompt-builder/
│   │   └── PromptBuilder.tsx    # React Flow visual prompt builder component
│   ├── prompt-create/
│   │   └── CreatePromptDialog.tsx # Dialog for creating new prompts
│   ├── prompt-list/
│   │   └── PromptList.tsx        # List view of all prompts with search/filter
│   ├── test-run/
│   │   ├── TestRunDialog.tsx    # Dialog for creating test runs
│   │   └── TestRunDetails.tsx   # Detailed view of test run results
│   ├── dashboard/
│   │   ├── Charts.tsx            # Recharts components for dashboard visualizations
│   │   └── StatsCard.tsx         # Reusable stat card component
│   ├── providers/
│   │   └── SessionProvider.tsx   # Client-side wrapper for NextAuth SessionProvider
│   └── ui/                      # Shadcn UI components (button, dialog, input, etc.)
├── lib/
│   ├── api.ts                   # API client - axios instance and API methods
│   ├── auth.ts                  # NextAuth configuration and helpers
│   └── utils.ts                 # Utility functions (cn helper for Tailwind)
├── types/
│   └── next-auth.d.ts           # TypeScript type extensions for NextAuth
├── package.json                 # Frontend dependencies
├── next.config.js               # Next.js configuration (transpiles recharts)
├── tailwind.config.ts           # Tailwind CSS configuration
└── tsconfig.json                # TypeScript configuration

Why this structure:

  • App Router: Next.js 13+ App Router provides better performance, server components, and file-based routing.
  • Component organization: Separated by feature (layout, prompt-builder, test-run) for better maintainability.
  • lib/: Centralized API client and utilities reduce code duplication.
  • types/: TypeScript definitions ensure type safety across the application.
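
To illustrate the role of `lib/api.ts`, here is a minimal sketch of a centralized API client. The real file wraps an axios instance; this sketch shows only the URL building, and the endpoint paths and helper names are assumptions, not the repository's actual code.

```typescript
// Sketch of a centralized API client in the spirit of lib/api.ts.
// The real client wraps axios and reads NEXT_PUBLIC_API_URL; the
// endpoint paths below are hypothetical.
const API_BASE = "http://localhost:4000"; // real app: process.env.NEXT_PUBLIC_API_URL

// Build a full request URL from a path and optional query parameters.
function apiUrl(path: string, params?: Record<string, string>): string {
  const query = Object.entries(params ?? {})
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `${API_BASE}${path}${query ? `?${query}` : ""}`;
}

// Endpoint helpers the page components would call.
const endpoints = {
  prompts: () => apiUrl("/prompts"),
  prompt: (id: string) => apiUrl(`/prompts/${id}`),
  testRuns: (promptId?: string) =>
    apiUrl("/test-runs", promptId ? { promptId } : undefined),
};
```

Centralizing URL construction this way means pages never hard-code the backend address, which is what makes `NEXT_PUBLIC_API_URL` the single switch between local and deployed environments.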

Backend Node.js (/backend-node)

The Node.js backend serves as the API gateway and handles business logic.

backend-node/
├── src/
│   ├── index.ts                 # Express server entry point
│   │                            # - Sets up Express app, CORS, middleware
│   │                            # - Defines API routes (prompts, test-runs)
│   │                            # - Handles prompt CRUD operations
│   │                            # - Creates test runs and queues them
│   └── services/
│       └── queueService.ts      # Celery task queue service
│                                # - Sends test run tasks to Python backend
│                                # - Handles async job orchestration
├── prisma/
│   └── schema.prisma            # Prisma ORM schema definition
│                                # - User, Prompt, TestRun, PromptVersion models
│                                # - Database relationships and indexes
├── Dockerfile                   # Production Docker image
├── Dockerfile.dev               # Development Docker image (with hot reload)
├── package.json                 # Node.js dependencies
└── tsconfig.json                # TypeScript configuration

Why Node.js backend:

  • Type safety: TypeScript ensures type safety between frontend and backend.
  • Prisma ORM: Provides type-safe database access and migrations.
  • Express: Fast, minimal, and well-suited for REST APIs.
  • Separation of concerns: Node.js handles data persistence, Python handles LLM execution.
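
The models mentioned above might be declared along these lines. This is an illustrative fragment only, not the repository's actual `schema.prisma`; field names are assumptions, and the real schema also defines `PromptVersion`, enums, and indexes.

```prisma
// Illustrative fragment; not the actual schema.prisma.
model User {
  id      String   @id @default(uuid())
  email   String   @unique
  prompts Prompt[]
}

model Prompt {
  id       String    @id @default(uuid())
  name     String
  content  Json
  version  Int       @default(1)
  userId   String
  user     User      @relation(fields: [userId], references: [id])
  testRuns TestRun[]
}

model TestRun {
  id       String @id @default(uuid())
  status   String @default("PENDING")
  promptId String
  prompt   Prompt @relation(fields: [promptId], references: [id])
}
```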

Backend Python (/backend-python)

The Python backend handles LLM execution and evaluation.

backend-python/
├── app/
│   ├── __init__.py              # Package initialization
│   ├── celery_app.py            # Celery application configuration
│   │                            # - Defines async task: execute_prompt_task
│   │                            # - Handles test run execution workflow
│   ├── llm_providers.py         # LLM provider implementations
│   │                            # - OpenAIProvider: OpenAI API integration
│   │                            # - AnthropicProvider: Claude API integration
│   │                            # - MistralProvider: Mistral AI integration
│   │                            # - GeminiProvider: Google Gemini integration
│   │                            # - Prompt template variable substitution
│   └── scoring_engine.py        # Evaluation metrics computation
│                                # - Similarity scoring (semantic similarity)
│                                # - Latency measurement
│                                # - Cost calculation
├── main.py                      # FastAPI application entry point
│                                # - Health check endpoint
│                                # - API documentation
├── requirements.txt             # Python dependencies
├── Dockerfile                   # Production Docker image
└── Dockerfile.dev               # Development Docker image

Why Python backend:

  • LLM libraries: Python has the best ecosystem for LLM integrations (OpenAI, Anthropic, etc.).
  • Celery: Industry-standard async task queue for Python.
  • FastAPI: Modern, fast API framework with automatic OpenAPI documentation.
  • ML libraries: Easy integration with ML libraries for scoring (sentence-transformers, scikit-learn).

Shared Types (/shared)

shared/
└── types/
    └── index.ts                 # Shared TypeScript types
                                # - Prompt interface
                                # - PromptContent interface
                                # - TestRun interface
                                # - Ensures type consistency across frontend/backend

Why shared types:

  • Type safety: Ensures frontend and backend use the same data structures.
  • Single source of truth: Changes to types are reflected everywhere.
  • Reduces bugs: TypeScript catches mismatches at compile time.
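
The shared interfaces might look roughly like the following. The field names are illustrative assumptions, not copied from `shared/types/index.ts`.

```typescript
// Hypothetical shapes for the shared types; field names are assumptions.
interface PromptContent {
  template: string;   // prompt body with {{variable}} slots
  variables: string[]; // declared template variables
}

interface Prompt {
  id: string;
  name: string;
  content: PromptContent;
  version: number;
  createdAt: string;  // ISO-8601 timestamp
}

interface TestRun {
  id: string;
  promptId: string;
  provider: "openai" | "anthropic" | "mistral" | "gemini";
  status: "PENDING" | "RUNNING" | "COMPLETED" | "FAILED";
  latencyMs?: number; // populated once the run completes
  costUsd?: number;
  similarity?: number;
}
```

Because both the frontend and the Node.js backend import these from one place, renaming a field is a compile error everywhere it is used rather than a runtime surprise.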

Configuration Files

├── .gitignore                   # Git ignore patterns (node_modules, .env, etc.)
├── docker-compose.yml            # Service orchestration
└── Makefile                     # Development automation

🚀 Quick Start (Docker - Recommended)

Start everything with a single command:

make docker-up

Or manually:

docker-compose up -d

This will start:

  • Frontend (Next.js) at http://localhost:3000
  • Node.js backend (Express) at http://localhost:4000
  • Python backend (FastAPI) at http://localhost:8000
  • Celery worker for async test execution
  • PostgreSQL on port 5433
  • Redis on port 6379

Other Docker Commands

make docker-down      # Stop all services
make docker-logs      # View logs
make docker-build     # Rebuild images
make docker-restart   # Restart services

Setup Database

After starting services for the first time:

make setup-db

This will:

  1. Generate Prisma client
  2. Run database migrations
  3. Create all necessary tables and enums

📝 Environment Variables

Create a .env file in the root directory (or set environment variables):

# Required
NEXTAUTH_SECRET=your-secret-key-here
OPENAI_API_KEY=sk-...  # At least one LLM provider key is required

# Optional - LLM Provider Keys
ANTHROPIC_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=

# Optional - OAuth (for user authentication)
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=

# Database (usually handled by docker-compose)
DATABASE_URL=postgresql://promptforge:promptforge@postgres:5432/promptforge

Frontend specific (frontend/.env.local):

NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=your-secret-key-here
NEXT_PUBLIC_API_URL=http://localhost:4000
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret

The docker-compose.yml will automatically use these environment variables.

🔧 Development

Using Docker (Recommended)

# Start all services
make docker-up

# View logs
make docker-logs

# Restart a specific service
docker-compose restart frontend

# Stop all services
make docker-down

Local Development (Without Docker)

If you prefer to run services locally:

# Install dependencies
make install

# Start services manually (4 terminals):
# Terminal 1: make dev-frontend
# Terminal 2: make dev-backend
# Terminal 3: cd backend-python && source venv/bin/activate && uvicorn main:app --reload
# Terminal 4: cd backend-python && source venv/bin/activate && celery -A app.celery_app worker --loglevel=info

🛠️ Available Commands

See all available commands:

make help

📚 Key Features

1. Prompt Management

  • Create, read, update, delete prompts
  • Version control for prompt iterations
  • Search and filter prompts
  • Template variable support ({{variable}})
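
Template variable substitution works by replacing each `{{variable}}` token with a value supplied at test time. The actual implementation lives in the Python backend (`llm_providers.py`); this is a hypothetical sketch of the behavior, with the policy of leaving unknown variables untouched as an assumption.

```typescript
// Sketch of {{variable}} substitution; the real implementation is in
// the Python backend and may differ.
function renderTemplate(
  template: string,
  variables: Record<string, string>
): string {
  // Replace every {{name}} token; unknown variables are left as-is
  // (an assumed policy) so missing inputs are easy to spot in output.
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    name in variables ? variables[name] : match
  );
}
```

For example, `renderTemplate("Summarize {{topic}}", { topic: "LLMs" })` fills the slot, while a variable missing from the input survives verbatim in the rendered prompt.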

2. Test Execution

  • Test prompts across multiple LLM providers
  • Compare outputs from different models
  • Async execution via Celery workers
  • Real-time status updates

3. Performance Metrics

  • Latency: Response time measurement
  • Cost: Token usage and cost calculation
  • Quality: Semantic similarity scoring
  • Analytics: Dashboard with charts and statistics

4. Authentication

  • OAuth integration (Google, GitHub)
  • Secure session management
  • Protected routes
  • User-specific prompts and test runs

🔧 Troubleshooting

Services won't start

# Check logs
make docker-logs

# Rebuild images
make docker-build

# Restart services
make docker-restart

Database issues

# Reset database
docker-compose down -v
docker-compose up -d postgres redis
make setup-db

Port conflicts

If ports 3000, 4000, 8000, 5433, or 6379 are in use, either:

  1. Stop the conflicting services
  2. Update port mappings in docker-compose.yml

Frontend not connecting to backend

  • Ensure NEXT_PUBLIC_API_URL is set correctly
  • Check that backend-node is running on port 4000
  • Verify CORS is configured in backend-node/src/index.ts

Test runs stuck in PENDING

  • Check Celery worker logs: docker-compose logs celery-worker
  • Verify Redis is running: docker-compose ps redis
  • Ensure LLM API keys are set in environment variables

🧪 Testing

Create Sample Prompts

Use the examples in DUMMY_PROMPTS.md to create test prompts.

Run Test Cases

  1. Create a prompt in the UI
  2. Go to Test Runs page
  3. Click "Test Run" button
  4. Select prompt and provide test input (JSON format)
  5. View results with metrics
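
The test input in step 4 maps template variables to concrete values. A hypothetical example for a prompt containing `{{topic}}` and `{{audience}}` (the keys must match your prompt's variables):

```json
{
  "topic": "climate change",
  "audience": "high-school students"
}
```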

📚 Additional Documentation

  • CONTRIBUTING.md - contribution guidelines
  • DUMMY_PROMPTS.md - sample prompts for testing
  • TEST_RUN_FLOW.md - test run execution flow

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

📝 License

This project is licensed under the MIT License.

👤 Author

Sushant R. Dangal
