
πŸš€ LLM API Gateway

A production-grade Node.js API gateway for managing LLM (Large Language Model) requests with support for multiple providers including Groq and Ollama. Built with enterprise-level middleware for authentication, rate limiting, logging, and validation.

✨ Features

  • Multi-Provider Support: Integrates with Groq (Llama 3.3 70B) and Ollama
  • Authentication Middleware: API key-based authentication
  • Rate Limiting: Prevent abuse with configurable rate limits
  • Request Logging: Comprehensive logging for debugging and monitoring
  • Input Validation: Schema-based validation for requests
  • Error Handling: Structured error responses with proper HTTP status codes
  • Health Checks: Monitor provider availability and latency
  • TypeScript-ready: Well-documented code with JSDoc comments

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Client    β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
       β”‚
       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Express App (app.js)           β”‚
β”‚                                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚  Middleware Stack        β”‚  β”‚
β”‚  β”‚  1. Logger               β”‚  β”‚
β”‚  β”‚  2. Auth (API Key)       β”‚  β”‚
β”‚  β”‚  3. Rate Limiter         β”‚  β”‚
β”‚  β”‚  4. Validator (routes)   β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚                                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚  Routes                  β”‚  β”‚
β”‚  β”‚  - /api/chat             β”‚  β”‚
β”‚  β”‚  - /test                 β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Provider Layer                 β”‚
β”‚  β”œβ”€ Groq (Llama 3.3 70B)        β”‚
β”‚  └─ Ollama (Local)              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“ Project Structure

llm-api-gateway/
β”œβ”€β”€ app.js                 # Main Express application
β”œβ”€β”€ middleware/
β”‚   β”œβ”€β”€ auth.js           # API key authentication
β”‚   β”œβ”€β”€ logger.js         # Request logging
β”‚   β”œβ”€β”€ rateLimit.js      # Rate limiting
β”‚   └── validator.js      # Input validation
β”œβ”€β”€ providers/
β”‚   β”œβ”€β”€ groq.js           # Groq API integration
β”‚   └── ollama.js         # Ollama integration
β”œβ”€β”€ routes/
β”‚   └── chat.js           # Chat endpoints
β”œβ”€β”€ utils/
β”‚   └── errors.js         # Custom error classes
β”œβ”€β”€ package.json
β”œβ”€β”€ .env.example
└── README.md

🚦 Getting Started

Prerequisites

  • Node.js (v16+)
  • npm or yarn
  • Groq API key (get from console.groq.com)
  • Ollama (optional, for local models)

Installation

  1. Clone the repository

    git clone https://github.com/dishamurthy/llm-api-gateway.git
    cd llm-api-gateway
  2. Install dependencies

    npm install
  3. Set up environment variables

    Create a .env file in the root directory:

    # Server Configuration
    PORT=3000
    NODE_ENV=development
    
    # API Gateway Authentication
    API_KEY=your-secret-api-key-here
    
    # Groq Configuration
    GROQ_API_KEY=your-groq-api-key-here
    
    # Ollama Configuration (optional)
    OLLAMA_BASE_URL=http://localhost:11434
  4. Start the server

    npm start

    The server will start on http://localhost:3000

πŸ“‘ API Endpoints

Health Check

GET /

Response:

{
  "message": "AI Gateway API",
  "status": "running",
  "timestamp": "2026-02-12T15:13:35.000Z"
}

Test Authentication

GET /test
Headers:
  x-api-key: your-secret-api-key

Response:

{
  "message": "Test endpoint",
  "user": {
    "id": "user_123",
    "name": "Test User",
    "tier": "free"
  }
}
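The /test endpoint exercises the API-key check. Here is a minimal sketch of what middleware/auth.js might do; the lookup and the attached user object are illustrative (they simply mirror the response shape above), not the repository's actual code:

```javascript
// Sketch of an x-api-key check like the one middleware/auth.js performs.
// The user object and key comparison are assumptions for illustration.
function authMiddleware(req, res, next) {
  const key = req.headers['x-api-key'];
  if (!key) {
    // No key at all -> 401 Authentication required
    return res.status(401).json({
      error: 'Authentication required',
      message: 'Please provide x-api-key header',
    });
  }
  if (key !== process.env.API_KEY) {
    // Wrong key -> 403 Invalid API key
    return res.status(403).json({
      error: 'Invalid API key',
      type: 'authorization_error',
      message: 'The provided API key is not valid',
    });
  }
  // Valid key: attach a user record for downstream middleware and routes
  req.user = { id: 'user_123', name: 'Test User', tier: 'free' };
  next();
}
```

Attaching the user to req is what lets later middleware (for example, the rate limiter) see the caller's tier.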

Chat Completion

POST /api/chat
Headers:
  x-api-key: your-secret-api-key
  Content-Type: application/json

Body:
{
  "message": "What is the capital of France?",
  "provider": "groq",
  "options": {
    "model": "llama-3.3-70b-versatile",
    "maxTokens": 1000,
    "temperature": 0.7
  }
}

Response:

{
  "answer": "The capital of France is Paris.",
  "provider": "groq",
  "model": "llama-3.3-70b-versatile",
  "latency": 1247,
  "tokensUsed": 42
}
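Conceptually, the handler behind POST /api/chat picks a provider by name, times the call, and assembles the response shape shown above. The sketch below uses stub provider functions (the real ones live in providers/); the names and the ollama default model are assumptions:

```javascript
// Illustrative dispatch for POST /api/chat. Provider functions are stubs;
// the real integrations live in providers/groq.js and providers/ollama.js.
const providers = {
  groq: async (message, options) => ({
    answer: `(groq) ${message}`,
    model: options.model || 'llama-3.3-70b-versatile',
    tokensUsed: 42,
  }),
  ollama: async (message, options) => ({
    answer: `(ollama) ${message}`,
    model: options.model || 'llama3', // default model name is an assumption
    tokensUsed: 42,
  }),
};

async function handleChat({ message, provider = 'groq', options = {} }) {
  const fn = providers[provider];
  if (!fn) throw new Error(`Unknown provider: ${provider}`);
  const start = Date.now();
  const result = await fn(message, options);
  // latency is measured around the provider call, in milliseconds
  return { ...result, provider, latency: Date.now() - start };
}
```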

πŸ”§ Configuration

Middleware Order

The middleware stack runs in this order (defined in app.js):

  1. Logger - Logs every request
  2. Auth - Validates API key
  3. Rate Limiter - Prevents spam/abuse
  4. Validator - Validates request body (per-route)

⚠️ Order matters! Middleware runs sequentially.
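Express runs app.use() middleware in registration order, each one handing control to the next via next(). The driver below simulates that chain with no-op middleware to make the ordering concrete (names are illustrative):

```javascript
// Simulate the sequential middleware chain: each middleware records that
// it ran, then passes control along with next(), just as Express does.
const order = [];

const logger = (req, res, next) => { order.push('logger'); next(); };
const auth = (req, res, next) => { order.push('auth'); next(); };
const rateLimiter = (req, res, next) => { order.push('rateLimiter'); next(); };
const validator = (req, res, next) => { order.push('validator'); next(); };

// Tiny driver that mimics Express's registration-order execution.
function runStack(stack, req, res) {
  const queue = [...stack];
  const next = () => {
    const mw = queue.shift();
    if (mw) mw(req, res, next);
  };
  next();
}

runStack([logger, auth, rateLimiter, validator], {}, {});
console.log(order.join(' -> '));
// -> logger -> auth -> rateLimiter -> validator
```

This is why auth must be registered before the rate limiter: the limiter needs to know who the caller is before it can apply a per-tier limit.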

Rate Limiting

Default configuration:

  • Free tier: 10 requests per minute
  • Pro tier: 100 requests per minute

Configure in middleware/rateLimit.js
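A fixed-window limiter matching the tiers above could look like the following. The real logic lives in middleware/rateLimit.js; the in-memory store, function name, and window math here are assumptions:

```javascript
// Illustrative fixed-window rate limiter (per-minute windows).
// Store and helper name are assumptions, not the repo's actual code.
const LIMITS = { free: 10, pro: 100 }; // requests per minute
const windows = new Map(); // apiKey -> { count, windowStart }

function checkRateLimit(apiKey, tier, now = Date.now()) {
  const entry = windows.get(apiKey);
  if (!entry || now - entry.windowStart >= 60000) {
    // First request, or the one-minute window has rolled over: reset.
    windows.set(apiKey, { count: 1, windowStart: now });
    return { allowed: true };
  }
  if (entry.count >= LIMITS[tier]) {
    // Over the limit: report seconds remaining until the window resets.
    const retryAfter = Math.ceil((entry.windowStart + 60000 - now) / 1000);
    return { allowed: false, retryAfter };
  }
  entry.count += 1;
  return { allowed: true };
}
```

The retryAfter value is what feeds the "Try again in 45 seconds" error shown later in this README.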

Supported Providers

Groq

  • Model: llama-3.3-70b-versatile
  • API: OpenAI-compatible
  • Base URL: https://api.groq.com/openai/v1

Ollama

  • Runs locally
  • Supports multiple open-source models
  • Default URL: http://localhost:11434

πŸ›‘οΈ Error Handling

The API uses custom error classes for structured responses:

Error Types

Error Type           Status Code   Description
APIError             Variable      External API errors (Groq, Ollama)
ValidationError      400           Invalid request data
RateLimitError       429           Rate limit exceeded
AuthenticationError  401           Missing API key
AuthorizationError   403           Invalid API key

Example Error Response

{
  "error": "Invalid API key",
  "type": "authorization_error",
  "message": "The provided API key is not valid"
}
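A hedged sketch of what the error classes in utils/errors.js might look like, plus a centralized Express error handler that maps any thrown error to a JSON response. Constructor signatures and the handler are assumptions based on the table and examples above:

```javascript
// Illustrative custom error hierarchy; constructor shapes are assumptions.
class APIError extends Error {
  constructor(message, statusCode, type) {
    super(message);
    this.statusCode = statusCode;
    this.type = type;
  }
}

class ValidationError extends APIError {
  constructor(message) { super(message, 400, 'validation_error'); }
}
class RateLimitError extends APIError {
  constructor(message, retryAfter) {
    super(message, 429, 'rate_limit_error');
    this.retryAfter = retryAfter; // seconds until the window resets
  }
}
class AuthenticationError extends APIError {
  constructor(message) { super(message, 401, 'authentication_error'); }
}
class AuthorizationError extends APIError {
  constructor(message) { super(message, 403, 'authorization_error'); }
}

// A last-in-line Express error handler turns thrown errors into JSON.
function errorHandler(err, req, res, next) {
  res.status(err.statusCode || 500).json({
    error: err.message,
    type: err.type || 'internal_error',
  });
}
```

Throwing typed errors from middleware and providers keeps every failure path funneling through one handler with consistent HTTP status codes.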

πŸ“Š Sample Output

Successful Chat Request

curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "message": "Explain quantum computing in simple terms",
    "provider": "groq"
  }'

Terminal Output:

πŸ” INCOMING REQUEST
   Method: POST
   Path: /api/chat
   IP: ::1
   Time: 2026-02-12T15:13:35.000Z

πŸ”‘ AUTH CHECK
   Received key: your-api-k...
   βœ… Valid API key

πŸ€– GROQ API CALL
   Model: llama-3.3-70b-versatile
   Message: Explain quantum computing in simple terms...
   βœ… Success (1247ms)
   Response: Quantum computing is a new way of processing...

API Response:

{
  "answer": "Quantum computing is a new way of processing information that uses quantum mechanics...",
  "provider": "groq",
  "model": "llama-3.3-70b-versatile",
  "latency": 1247,
  "tokensUsed": 156
}

Rate Limit Error

{
  "error": "Rate limit exceeded. Try again in 45 seconds.",
  "retryAfter": 45,
  "type": "rate_limit_error"
}

Authentication Error

{
  "error": "Authentication required",
  "message": "Please provide x-api-key header"
}

πŸ§ͺ Testing

Using cURL

# Test authentication
curl http://localhost:3000/test \
  -H "x-api-key: your-api-key"

# Send chat message
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{"message": "Hello!", "provider": "groq"}'

Using Postman

  1. Import the collection (create one from the endpoints above)
  2. Set environment variables:
    • BASE_URL: http://localhost:3000
    • API_KEY: Your API key
  3. Test endpoints

πŸ”’ Security Best Practices

  1. Never commit .env file - It contains sensitive API keys
  2. Use strong API keys - Generate cryptographically secure keys
  3. Enable HTTPS in production
  4. Implement rate limiting - Already configured
  5. Validate all inputs - Already implemented
  6. Monitor logs for suspicious activity

πŸš€ Deployment

Environment Variables for Production

NODE_ENV=production
PORT=3000
API_KEY=<strong-secret-key>
GROQ_API_KEY=<your-groq-key>

Deployment Platforms

  • Heroku: git push heroku main
  • Railway: Connect GitHub repo
  • Render: Connect GitHub repo
  • AWS EC2: Use PM2 for process management
  • Docker: Build and deploy container

PM2 (Process Manager)

npm install -g pm2
pm2 start app.js --name llm-gateway
pm2 save
pm2 startup

πŸ“ˆ Future Enhancements

  • Add support for more providers (OpenAI, Anthropic, Cohere)
  • Implement conversation history/context
  • Add streaming responses
  • Database integration for user management
  • Analytics dashboard
  • WebSocket support for real-time chat
  • Token usage tracking and billing
  • Caching layer for repeated queries
  • Multi-model routing (automatically choose best model)

🀝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Groq - Ultra-fast LLM inference
  • Ollama - Local LLM deployment
  • Express.js - Web framework
  • Llama 3.3 70B - Meta's powerful language model

Built with ❀️ by dishamurthy

Production-grade LLM API Gateway for the modern AI stack.
