A production-grade Node.js API gateway for managing LLM (Large Language Model) requests with support for multiple providers including Groq and Ollama. Built with enterprise-level middleware for authentication, rate limiting, logging, and validation.
- Multi-Provider Support: Integrates with Groq (Llama 3.3 70B) and Ollama
- Authentication Middleware: API key-based authentication
- Rate Limiting: Prevent abuse with configurable rate limits
- Request Logging: Comprehensive logging for debugging and monitoring
- Input Validation: Schema-based validation for requests
- Error Handling: Structured error responses with proper HTTP status codes
- Health Checks: Monitor provider availability and latency
- TypeScript-ready: Well-documented code with JSDoc comments
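The health-check feature above amounts to timing a probe against each provider; a minimal sketch of that idea (the `measureLatency` helper and its return shape are illustrative, not the repo's actual API):

```javascript
// Minimal health-check helper: times an async probe and reports status.
// `probe` is any function that resolves on success (e.g. a fetch against
// the provider's base URL); the names here are illustrative.
async function measureLatency(probe) {
  const start = Date.now();
  try {
    await probe();
    return { ok: true, latency: Date.now() - start };
  } catch (err) {
    return { ok: false, latency: Date.now() - start, error: err.message };
  }
}
```

A health route could run this once per provider and return the results as JSON.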
```
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────┐
│      Express App (app.js)       │
│                                 │
│  ┌───────────────────────────┐  │
│  │     Middleware Stack      │  │
│  │  1. Logger                │  │
│  │  2. Auth (API Key)        │  │
│  │  3. Rate Limiter          │  │
│  │  4. Validator (routes)    │  │
│  └───────────────────────────┘  │
│                                 │
│  ┌───────────────────────────┐  │
│  │          Routes           │  │
│  │  - /api/chat              │  │
│  │  - /test                  │  │
│  └───────────────────────────┘  │
└────────────────┬────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│         Provider Layer          │
│  ├─ Groq (Llama 3.3 70B)        │
│  └─ Ollama (Local)              │
└─────────────────────────────────┘
```
```
llm-api-gateway/
├── app.js                 # Main Express application
├── middleware/
│   ├── auth.js            # API key authentication
│   ├── logger.js          # Request logging
│   ├── rateLimit.js       # Rate limiting
│   └── validator.js       # Input validation
├── providers/
│   ├── groq.js            # Groq API integration
│   └── ollama.js          # Ollama integration
├── routes/
│   └── chat.js            # Chat endpoints
├── utils/
│   └── errors.js          # Custom error classes
├── package.json
├── .env.example
└── README.md
```
- Node.js (v16+)
- npm or yarn
- Groq API key (get from console.groq.com)
- Ollama (optional, for local models)
1. Clone the repository

   ```bash
   git clone https://github.com/dishamurthy/llm-api-gateway.git
   cd llm-api-gateway
   ```

2. Install dependencies

   ```bash
   npm install
   ```

3. Set up environment variables

   Create a `.env` file in the root directory:

   ```bash
   # Server Configuration
   PORT=3000
   NODE_ENV=development

   # API Gateway Authentication
   API_KEY=your-secret-api-key-here

   # Groq Configuration
   GROQ_API_KEY=your-groq-api-key-here

   # Ollama Configuration (optional)
   OLLAMA_BASE_URL=http://localhost:11434
   ```

4. Start the server

   ```bash
   npm start
   ```

   The server will start on `http://localhost:3000`.
`GET /`

Response:

```json
{
  "message": "AI Gateway API",
  "status": "running",
  "timestamp": "2026-02-12T15:13:35.000Z"
}
```

`GET /test`

Headers:

```
x-api-key: your-secret-api-key
```

Response:

```json
{
  "message": "Test endpoint",
  "user": {
    "id": "user_123",
    "name": "Test User",
    "tier": "free"
  }
}
```

`POST /api/chat`

Headers:

```
x-api-key: your-secret-api-key
Content-Type: application/json
```

Body:

```json
{
  "message": "What is the capital of France?",
  "provider": "groq",
  "options": {
    "model": "llama-3.3-70b-versatile",
    "maxTokens": 1000,
    "temperature": 0.7
  }
}
```

Response:

```json
{
  "answer": "The capital of France is Paris.",
  "provider": "groq",
  "model": "llama-3.3-70b-versatile",
  "latency": 1247,
  "tokensUsed": 42
}
```

The middleware stack runs in this order (defined in `app.js`):
- Logger - Logs every request
- Auth - Validates API key
- Rate Limiter - Prevents spam/abuse
- Validator - Validates request body (per-route)
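Conceptually, each of these is a `(req, res, next)` function that either passes control on or stops the request. A simplified model of that ordering (a sketch, not the actual `app.js`, which wires the same chain through Express's `app.use`):

```javascript
// Simplified model of the middleware chain: each middleware either calls
// next() to continue or returns early to stop the request. The stub
// middlewares below are illustrative.
function runStack(middlewares, req) {
  let i = 0;
  const next = () => {
    const mw = middlewares[i++];
    if (mw) mw(req, next);
  };
  next();
  return req;
}

const logger = (req, next) => { req.trace = ['logger']; next(); };
const auth = (req, next) => {
  if (!req.headers['x-api-key']) return; // stop: a 401 would be sent here
  req.trace.push('auth');
  next();
};
const rateLimiter = (req, next) => { req.trace.push('rateLimit'); next(); };

const req = runStack([logger, auth, rateLimiter], {
  headers: { 'x-api-key': 'secret' },
});
// req.trace is now ['logger', 'auth', 'rateLimit']
```

A request without the `x-api-key` header never reaches the rate limiter, which is exactly why the ordering matters.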
Default configuration:
- Free tier: 10 requests per minute
- Pro tier: 100 requests per minute
Configure these limits in `middleware/rateLimit.js`.
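A minimal fixed-window limiter matching those tiers could look like the sketch below (illustrative; the repo's `middleware/rateLimit.js` may use a different strategy or storage):

```javascript
// Fixed-window rate limiter keyed by API key. The per-tier limits match
// the defaults above; the in-memory Map and function shape are illustrative.
const LIMITS = { free: 10, pro: 100 }; // requests per minute
const WINDOW_MS = 60 * 1000;
const hits = new Map(); // apiKey -> { count, windowStart }

function allowRequest(apiKey, tier, now = Date.now()) {
  const entry = hits.get(apiKey);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    hits.set(apiKey, { count: 1, windowStart: now }); // new window
    return true;
  }
  if (entry.count >= (LIMITS[tier] || LIMITS.free)) return false;
  entry.count += 1;
  return true;
}
```

In Express this would sit inside a middleware that responds with a 429 and a `retryAfter` hint whenever `allowRequest` returns false.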
- Model: `llama-3.3-70b-versatile`
- API: OpenAI-compatible
- Base URL: `https://api.groq.com/openai/v1`
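Because the API is OpenAI-compatible, `providers/groq.js` presumably POSTs to the `/chat/completions` path under that base URL. A hedged sketch of building such a request (the function name and defaults are illustrative):

```javascript
// Builds fetch arguments for Groq's OpenAI-compatible chat completions
// endpoint. The URL and payload shape follow the OpenAI-compatible API;
// the function name and option defaults are illustrative.
function buildGroqRequest(message, { model = 'llama-3.3-70b-versatile',
                                     maxTokens = 1000,
                                     temperature = 0.7 } = {}, apiKey) {
  return {
    url: 'https://api.groq.com/openai/v1/chat/completions',
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: message }],
        max_tokens: maxTokens,
        temperature,
      }),
    },
  };
}

// Inside an async function in the provider:
// const { url, options } = buildGroqRequest('Hi', {}, process.env.GROQ_API_KEY);
// const res = await fetch(url, options);
```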
- Runs locally
- Supports multiple open-source models
- Default URL: `http://localhost:11434`
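`providers/ollama.js` likely targets Ollama's native `/api/chat` endpoint under that URL; a sketch under that assumption (the function name and default model are illustrative):

```javascript
// Builds a request for Ollama's /api/chat endpoint. The path and payload
// shape follow Ollama's documented REST API; the function name and the
// default model are illustrative.
function buildOllamaRequest(message, model = 'llama3',
                            baseUrl = process.env.OLLAMA_BASE_URL || 'http://localhost:11434') {
  return {
    url: `${baseUrl}/api/chat`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: message }],
        stream: false, // return one JSON response instead of chunked output
      }),
    },
  };
}
```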
The API uses custom error classes for structured responses:
| Error Type | Status Code | Description |
|---|---|---|
| `APIError` | Variable | External API errors (Groq, Ollama) |
| `ValidationError` | 400 | Invalid request data |
| `RateLimitError` | 429 | Rate limit exceeded |
| `AuthenticationError` | 401 | Missing API key |
| `AuthorizationError` | 403 | Invalid API key |
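`utils/errors.js` presumably defines these as `Error` subclasses carrying an HTTP status; a minimal sketch consistent with the table (the exact fields and the subclassing choice are illustrative):

```javascript
// Custom error classes carrying an HTTP status code so the error-handling
// middleware can map them straight onto the response.
class APIError extends Error {
  constructor(message, statusCode = 502) {
    super(message);
    this.name = 'APIError';
    this.statusCode = statusCode; // variable: mirrors the upstream failure
  }
}

class ValidationError extends APIError {
  constructor(message) { super(message, 400); this.name = 'ValidationError'; }
}

class RateLimitError extends APIError {
  constructor(message, retryAfter) {
    super(message, 429);
    this.name = 'RateLimitError';
    this.retryAfter = retryAfter; // seconds until the window resets
  }
}

class AuthenticationError extends APIError {
  constructor(message) { super(message, 401); this.name = 'AuthenticationError'; }
}

class AuthorizationError extends APIError {
  constructor(message) { super(message, 403); this.name = 'AuthorizationError'; }
}
```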
Example error response:

```json
{
  "error": "Invalid API key",
  "type": "authorization_error",
  "message": "The provided API key is not valid"
}
```

Example request:

```bash
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "message": "Explain quantum computing in simple terms",
    "provider": "groq"
  }'
```

Terminal Output:
```
📥 INCOMING REQUEST
   Method: POST
   Path: /api/chat
   IP: ::1
   Time: 2026-02-12T15:13:35.000Z

🔑 AUTH CHECK
   Received key: your-api-k...
   ✅ Valid API key

🤖 GROQ API CALL
   Model: llama-3.3-70b-versatile
   Message: Explain quantum computing in simple terms...
   ✅ Success (1247ms)
   Response: Quantum computing is a new way of processing...
```
API Response:

```json
{
  "answer": "Quantum computing is a new way of processing information that uses quantum mechanics...",
  "provider": "groq",
  "model": "llama-3.3-70b-versatile",
  "latency": 1247,
  "tokensUsed": 156
}
```

Rate limit exceeded:

```json
{
  "error": "Rate limit exceeded. Try again in 45 seconds.",
  "retryAfter": 45,
  "type": "rate_limit_error"
}
```

Missing API key:

```json
{
  "error": "Authentication required",
  "message": "Please provide x-api-key header"
}
```

```bash
# Test authentication
curl http://localhost:3000/test \
  -H "x-api-key: your-api-key"

# Send chat message
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{"message": "Hello!", "provider": "groq"}'
```

- Import the collection (create one from the endpoints above)
- Set environment variables:
  - `BASE_URL`: `http://localhost:3000`
  - `API_KEY`: Your API key
- Test endpoints
- Never commit the `.env` file - it contains sensitive API keys
- Use strong API keys - generate cryptographically secure keys
- Enable HTTPS in production
- Implement rate limiting - Already configured
- Validate all inputs - Already implemented
- Monitor logs for suspicious activity
```bash
NODE_ENV=production
PORT=3000
API_KEY=<strong-secret-key>
GROQ_API_KEY=<your-groq-key>
```

- Heroku: `git push heroku main`
- Railway: Connect GitHub repo
- Render: Connect GitHub repo
- AWS EC2: Use PM2 for process management
- Docker: Build and deploy container
```bash
npm install -g pm2
pm2 start app.js --name llm-gateway
pm2 save
pm2 startup
```

- Add support for more providers (OpenAI, Anthropic, Cohere)
- Implement conversation history/context
- Add streaming responses
- Database integration for user management
- Analytics dashboard
- WebSocket support for real-time chat
- Token usage tracking and billing
- Caching layer for repeated queries
- Multi-model routing (automatically choose best model)
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Groq - Ultra-fast LLM inference
- Ollama - Local LLM deployment
- Express.js - Web framework
- Llama 3.3 70B - Meta's powerful language model
Built with ❤️ by dishamurthy
Production-grade LLM API Gateway for the modern AI stack.