A production-grade Node.js API gateway for managing LLM (Large Language Model) requests with support for multiple providers including Groq and Ollama. Built with enterprise-level middleware for authentication, rate limiting, logging, and validation.
- Multi-Provider Support: Integrates with Groq (Llama 3.3 70B) and Ollama
- Authentication Middleware: API key-based authentication
- Rate Limiting: Prevent abuse with configurable rate limits
- Request Logging: Comprehensive logging for debugging and monitoring
- Input Validation: Schema-based validation for requests
- Error Handling: Structured error responses with proper HTTP status codes
- Health Checks: Monitor provider availability and latency
- TypeScript-ready: Well-documented code with JSDoc comments
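The health-check feature above amounts to timing a probe against each provider; a minimal sketch of that idea (the `measureLatency` helper and its return shape are illustrative, not the repo's actual API):

```javascript
// Minimal health-check helper: times an async probe and reports status.
// `probe` is any function that resolves on success (e.g. a fetch against
// the provider's base URL); the names here are illustrative.
async function measureLatency(probe) {
  const start = Date.now();
  try {
    await probe();
    return { ok: true, latency: Date.now() - start };
  } catch (err) {
    return { ok: false, latency: Date.now() - start, error: err.message };
  }
}
```

A health route could run this once per provider and return the results as JSON.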
```
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────┐
│      Express App (app.js)       │
│                                 │
│  ┌───────────────────────────┐  │
│  │     Middleware Stack      │  │
│  │  1. Logger                │  │
│  │  2. Auth (API Key)        │  │
│  │  3. Rate Limiter          │  │
│  │  4. Validator (routes)    │  │
│  └───────────────────────────┘  │
│                                 │
│  ┌───────────────────────────┐  │
│  │          Routes           │  │
│  │  - /api/chat              │  │
│  │  - /test                  │  │
│  └───────────────────────────┘  │
└────────────────┬────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│         Provider Layer          │
│  ├─ Groq (Llama 3.3 70B)        │
│  └─ Ollama (Local)              │
└─────────────────────────────────┘
```
```
llm-api-gateway/
├── app.js                 # Main Express application
├── middleware/
│   ├── auth.js            # API key authentication
│   ├── logger.js          # Request logging
│   ├── rateLimit.js       # Rate limiting
│   └── validator.js       # Input validation
├── providers/
│   ├── groq.js            # Groq API integration
│   └── ollama.js          # Ollama integration
├── routes/
│   └── chat.js            # Chat endpoints
├── utils/
│   └── errors.js          # Custom error classes
├── package.json
├── .env.example
└── README.md
```
- Node.js (v16+)
- npm or yarn
- Groq API key (get from console.groq.com)
- Ollama (optional, for local models)
1. Clone the repository

   ```bash
   git clone https://github.com/dishamurthy/llm-api-gateway.git
   cd llm-api-gateway
   ```

2. Install dependencies

   ```bash
   npm install
   ```

3. Set up environment variables

   Create a `.env` file in the root directory:

   ```bash
   # Server Configuration
   PORT=3000
   NODE_ENV=development

   # API Gateway Authentication
   API_KEY=your-secret-api-key-here

   # Groq Configuration
   GROQ_API_KEY=your-groq-api-key-here

   # Ollama Configuration (optional)
   OLLAMA_BASE_URL=http://localhost:11434
   ```

4. Start the server

   ```bash
   npm start
   ```

   The server will start on `http://localhost:3000`.
`GET /`

Response:

```json
{
  "message": "AI Gateway API",
  "status": "running",
  "timestamp": "2026-02-12T15:13:35.000Z"
}
```

`GET /test`

Headers:

```
x-api-key: your-secret-api-key
```

Response:

```json
{
  "message": "Test endpoint",
  "user": {
    "id": "user_123",
    "name": "Test User",
    "tier": "free"
  }
}
```

`POST /api/chat`

Headers:

```
x-api-key: your-secret-api-key
Content-Type: application/json
```

Body:

```json
{
  "message": "What is the capital of France?",
  "provider": "groq",
  "options": {
    "model": "llama-3.3-70b-versatile",
    "maxTokens": 1000,
    "temperature": 0.7
  }
}
```

Response:

```json
{
  "answer": "The capital of France is Paris.",
  "provider": "groq",
  "model": "llama-3.3-70b-versatile",
  "latency": 1247,
  "tokensUsed": 42
}
```

The middleware stack runs in this order (defined in `app.js`):
- Logger - Logs every request
- Auth - Validates API key
- Rate Limiter - Prevents spam/abuse
- Validator - Validates request body (per-route)
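Conceptually, each of these is a `(req, res, next)` function that either passes control on or stops the request. A simplified model of that ordering (a sketch, not the actual `app.js`, which wires the same chain through Express's `app.use`):

```javascript
// Simplified model of the middleware chain: each middleware either calls
// next() to continue or returns early to stop the request. The stub
// middlewares below are illustrative.
function runStack(middlewares, req) {
  let i = 0;
  const next = () => {
    const mw = middlewares[i++];
    if (mw) mw(req, next);
  };
  next();
  return req;
}

const logger = (req, next) => { req.trace = ['logger']; next(); };
const auth = (req, next) => {
  if (!req.headers['x-api-key']) return; // stop: a 401 would be sent here
  req.trace.push('auth');
  next();
};
const rateLimiter = (req, next) => { req.trace.push('rateLimit'); next(); };

const req = runStack([logger, auth, rateLimiter], {
  headers: { 'x-api-key': 'secret' },
});
// req.trace is now ['logger', 'auth', 'rateLimit']
```

A request without the `x-api-key` header never reaches the rate limiter, which is exactly why the ordering matters.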
Default configuration:
- Free tier: 10 requests per minute
- Pro tier: 100 requests per minute
Configure these limits in `middleware/rateLimit.js`.
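A minimal fixed-window limiter matching those tiers could look like the sketch below (illustrative; the repo's `middleware/rateLimit.js` may use a different strategy or storage):

```javascript
// Fixed-window rate limiter keyed by API key. The per-tier limits match
// the defaults above; the in-memory Map and function shape are illustrative.
const LIMITS = { free: 10, pro: 100 }; // requests per minute
const WINDOW_MS = 60 * 1000;
const hits = new Map(); // apiKey -> { count, windowStart }

function allowRequest(apiKey, tier, now = Date.now()) {
  const entry = hits.get(apiKey);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    hits.set(apiKey, { count: 1, windowStart: now }); // new window
    return true;
  }
  if (entry.count >= (LIMITS[tier] || LIMITS.free)) return false;
  entry.count += 1;
  return true;
}
```

In Express this would sit inside a middleware that responds with a 429 and a `retryAfter` hint whenever `allowRequest` returns false.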
- Model: `llama-3.3-70b-versatile`
- API: OpenAI-compatible
- Base URL: `https://api.groq.com/openai/v1`
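Because the API is OpenAI-compatible, `providers/groq.js` presumably POSTs to the `/chat/completions` path under that base URL. A hedged sketch of building such a request (the function name and defaults are illustrative):

```javascript
// Builds fetch arguments for Groq's OpenAI-compatible chat completions
// endpoint. The URL and payload shape follow the OpenAI-compatible API;
// the function name and option defaults are illustrative.
function buildGroqRequest(message, { model = 'llama-3.3-70b-versatile',
                                     maxTokens = 1000,
                                     temperature = 0.7 } = {}, apiKey) {
  return {
    url: 'https://api.groq.com/openai/v1/chat/completions',
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: message }],
        max_tokens: maxTokens,
        temperature,
      }),
    },
  };
}

// Inside an async function in the provider:
// const { url, options } = buildGroqRequest('Hi', {}, process.env.GROQ_API_KEY);
// const res = await fetch(url, options);
```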
- Runs locally
- Supports multiple open-source models
- Default URL: `http://localhost:11434`
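`providers/ollama.js` likely targets Ollama's native `/api/chat` endpoint under that URL; a sketch under that assumption (the function name and default model are illustrative):

```javascript
// Builds a request for Ollama's /api/chat endpoint. The path and payload
// shape follow Ollama's documented REST API; the function name and the
// default model are illustrative.
function buildOllamaRequest(message, model = 'llama3',
                            baseUrl = process.env.OLLAMA_BASE_URL || 'http://localhost:11434') {
  return {
    url: `${baseUrl}/api/chat`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: message }],
        stream: false, // return one JSON response instead of chunked output
      }),
    },
  };
}
```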
The API uses custom error classes for structured responses:
| Error Type | Status Code | Description |
|---|---|---|
| `APIError` | Variable | External API errors (Groq, Ollama) |
| `ValidationError` | 400 | Invalid request data |
| `RateLimitError` | 429 | Rate limit exceeded |
| `AuthenticationError` | 401 | Missing API key |
| `AuthorizationError` | 403 | Invalid API key |
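`utils/errors.js` presumably defines these as `Error` subclasses carrying an HTTP status; a minimal sketch consistent with the table (the exact fields and the subclassing choice are illustrative):

```javascript
// Custom error classes carrying an HTTP status code so the error-handling
// middleware can map them straight onto the response.
class APIError extends Error {
  constructor(message, statusCode = 502) {
    super(message);
    this.name = 'APIError';
    this.statusCode = statusCode; // variable: mirrors the upstream failure
  }
}

class ValidationError extends APIError {
  constructor(message) { super(message, 400); this.name = 'ValidationError'; }
}

class RateLimitError extends APIError {
  constructor(message, retryAfter) {
    super(message, 429);
    this.name = 'RateLimitError';
    this.retryAfter = retryAfter; // seconds until the window resets
  }
}

class AuthenticationError extends APIError {
  constructor(message) { super(message, 401); this.name = 'AuthenticationError'; }
}

class AuthorizationError extends APIError {
  constructor(message) { super(message, 403); this.name = 'AuthorizationError'; }
}
```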
Example error response:

```json
{
  "error": "Invalid API key",
  "type": "authorization_error",
  "message": "The provided API key is not valid"
}
```

Example request:

```bash
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "message": "Explain quantum computing in simple terms",
    "provider": "groq"
  }'
```

Terminal Output:
```
📥 INCOMING REQUEST
   Method: POST
   Path: /api/chat
   IP: ::1
   Time: 2026-02-12T15:13:35.000Z

🔑 AUTH CHECK
   Received key: your-api-k...
   ✅ Valid API key

🤖 GROQ API CALL
   Model: llama-3.3-70b-versatile
   Message: Explain quantum computing in simple terms...
   ✅ Success (1247ms)
   Response: Quantum computing is a new way of processing...
```
API Response:

```json
{
  "answer": "Quantum computing is a new way of processing information that uses quantum mechanics...",
  "provider": "groq",
  "model": "llama-3.3-70b-versatile",
  "latency": 1247,
  "tokensUsed": 156
}
```

Rate limit exceeded:

```json
{
  "error": "Rate limit exceeded. Try again in 45 seconds.",
  "retryAfter": 45,
  "type": "rate_limit_error"
}
```

Missing API key:

```json
{
  "error": "Authentication required",
  "message": "Please provide x-api-key header"
}
```

```bash
# Test authentication
curl http://localhost:3000/test \
  -H "x-api-key: your-api-key"

# Send chat message
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{"message": "Hello!", "provider": "groq"}'
```

- Import the collection (create one from the endpoints above)
- Set environment variables:
  - `BASE_URL`: `http://localhost:3000`
  - `API_KEY`: Your API key
- Test endpoints
- Never commit the `.env` file - it contains sensitive API keys
- Use strong API keys - generate cryptographically secure keys
- Enable HTTPS in production
- Implement rate limiting - Already configured
- Validate all inputs - Already implemented
- Monitor logs for suspicious activity
```bash
NODE_ENV=production
PORT=3000
API_KEY=<strong-secret-key>
GROQ_API_KEY=<your-groq-key>
```

- Heroku: `git push heroku main`
- Railway: Connect GitHub repo
- Render: Connect GitHub repo
- AWS EC2: Use PM2 for process management
- Docker: Build and deploy container
```bash
npm install -g pm2
pm2 start app.js --name llm-gateway
pm2 save
pm2 startup
```

- Add support for more providers (OpenAI, Anthropic, Cohere)
- Implement conversation history/context
- Add streaming responses
- Database integration for user management
- Analytics dashboard
- WebSocket support for real-time chat
- Token usage tracking and billing
- Caching layer for repeated queries
- Multi-model routing (automatically choose best model)
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Groq - Ultra-fast LLM inference
- Ollama - Local LLM deployment
- Express.js - Web framework
- Llama 3.3 70B - Meta's powerful language model
Built with ❤️ by dishamurthy
Production-grade LLM API Gateway for the modern AI stack.