A comprehensive, production-ready API rate-limiting service built with Spring Boot and Redis, following best practices. Implements both Token Bucket and Sliding Window algorithms for flexible rate limiting strategies.
- ✅ Two Rate Limiting Algorithms
  - Token Bucket: Allows bursts while maintaining average rate
  - Sliding Window: Precise rate limiting with smooth distribution
- ✅ Redis-Backed: Scalable and distributed rate limiting across multiple instances
- ✅ Flexible Key Strategies
  - IP-based limiting
  - User-based limiting
  - API key-based limiting
  - Custom header-based limiting
- ✅ Annotation-Driven: Simple `@RateLimit` annotation for declarative rate limiting
- ✅ Configurable: YAML-based configuration with per-endpoint customization
- ✅ Response Headers: Automatic rate limit info in response headers (X-RateLimit-*)
- ✅ Fail-Safe: Fails open if Redis is unavailable
- ✅ Production-Ready: Comprehensive tests, logging, and error handling
The Token Bucket algorithm maintains a bucket of tokens that refills at a constant rate:
- Each request consumes one token
- Tokens refill at a fixed rate (e.g., 10 tokens/minute)
- Allows bursts up to bucket capacity
- Best for APIs that can handle occasional traffic spikes
```
Bucket Capacity: 100 tokens
Refill Rate:     10 tokens/second
Result:          Sustains 10 req/sec, allows bursts up to 100
```
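To make the mechanics concrete, here is a minimal single-instance sketch of the token-bucket check in plain Java. It is illustrative only: the service itself keeps this state in Redis so limits are shared across instances, and the class and field names below are hypothetical rather than taken from the codebase.

```java
// Conceptual token-bucket sketch (single JVM, illustrative only).
public class TokenBucketSketch {
    private final double capacity;   // maximum tokens (burst capacity)
    private final double refillRate; // tokens added per second
    private double tokens;
    private long lastRefillNanos;

    public TokenBucketSketch(double capacity, double refillRate) {
        this.capacity = capacity;
        this.refillRate = refillRate;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    /** Returns true if the request may proceed, consuming one token. */
    public synchronized boolean tryConsume() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        // Refill in proportion to elapsed time, capped at capacity
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillRate);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```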
The Sliding Window algorithm tracks requests in a rolling time window:
- Counts requests in the last N seconds
- Removes old requests as the window slides
- Provides precise, consistent rate limiting
- Best for APIs requiring strict limits
```
Limit:  100 requests
Window: 60 seconds
Result: Exactly 100 requests per rolling 60-second period
```
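For comparison, here is a minimal single-instance sliding-window sketch. Again, this is illustrative only; the service tracks request timestamps in Redis so the window is shared across instances, and the names below are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Conceptual sliding-window sketch (single JVM, illustrative only).
public class SlidingWindowSketch {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowSketch(int limit, int windowSeconds) {
        this.limit = limit;
        this.windowMillis = windowSeconds * 1000L;
    }

    /** Returns true if the request fits inside the rolling window. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drop timestamps that have slid out of the window
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```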
- Java 17+
- Redis server running on localhost:6379
- Gradle
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd rate-limiter-service-api
  ```

- Start Redis:

  ```bash
  # Using Docker
  docker run -d -p 6379:6379 redis:latest

  # Or using local Redis
  redis-server
  ```

- Build and run:

  ```bash
  ./gradlew bootRun
  ```

The service will start on http://localhost:8080.
The simplest way to apply rate limiting is using the @RateLimit annotation:
```java
@RestController
@RequestMapping("/api")
public class MyController {

    // Default: 100 req/min using Token Bucket
    @RateLimit
    @GetMapping("/resource")
    public ResponseEntity<?> getResource() {
        return ResponseEntity.ok("Success");
    }

    // Strict: 5 req/min using Sliding Window
    @RateLimit(limit = 5, windowSeconds = 60, algorithm = "SLIDING_WINDOW")
    @GetMapping("/strict")
    public ResponseEntity<?> strictEndpoint() {
        return ResponseEntity.ok("Success");
    }

    // Bursty: 20 req/min with burst capacity of 50
    @RateLimit(
        limit = 20,
        windowSeconds = 60,
        algorithm = "TOKEN_BUCKET",
        burstCapacity = 50
    )
    @GetMapping("/bursty")
    public ResponseEntity<?> burstyEndpoint() {
        return ResponseEntity.ok("Success");
    }

    // User-based: 10 req/min per user
    @RateLimit(
        limit = 10,
        windowSeconds = 60,
        keyResolver = "USER"
    )
    @GetMapping("/user-resource")
    public ResponseEntity<?> userResource() {
        return ResponseEntity.ok("Success");
    }

    // API key-based: 100 req/hour per API key
    @RateLimit(
        limit = 100,
        windowSeconds = 3600,
        algorithm = "SLIDING_WINDOW",
        keyResolver = "API_KEY"
    )
    @GetMapping("/api-resource")
    public ResponseEntity<?> apiResource() {
        return ResponseEntity.ok("Success");
    }
}
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `limit` | int | 100 | Maximum requests in time window |
| `windowSeconds` | int | 60 | Time window in seconds |
| `algorithm` | String | TOKEN_BUCKET | Algorithm: TOKEN_BUCKET or SLIDING_WINDOW |
| `keyResolver` | String | IP | Key strategy: IP, USER, API_KEY, or CUSTOM |
| `keyExpression` | String | "" | Custom key expression for CUSTOM resolver |
| `burstCapacity` | int | -1 | Burst capacity for Token Bucket (default: limit) |
| `refillRate` | double | -1.0 | Refill rate for Token Bucket (default: limit/windowSeconds) |
| `disabled` | boolean | false | Temporarily disable rate limiting |
You can also use the service directly:
```java
@Service
public class MyService {

    @Autowired
    private RateLimiterService rateLimiterService;

    public void doSomething(String userId) {
        RateLimitConfig config = RateLimitConfig.tokenBucket(10, 60);
        RateLimitResult result = rateLimiterService.checkRateLimit(userId, config);

        if (!result.isAllowed()) {
            throw new RateLimitExceededException(result);
        }

        // Process request...
    }
}
```

Rate limiting is configured via YAML (typically in `application.yml`):

```yaml
rate-limiter:
  # Enable/disable rate limiting globally
  enabled: true

  # Default algorithm: TOKEN_BUCKET or SLIDING_WINDOW
  default-algorithm: TOKEN_BUCKET

  # Default rate limit
  default-limit: 100
  default-window-seconds: 60

  # Redis configuration
  redis-key-prefix: rate_limiter
  redis-key-ttl: 3600

  # Include rate limit headers in response
  include-headers: true

  # Per-endpoint configuration
  endpoints:
    "/api/auth/**":
      algorithm: SLIDING_WINDOW
      limit: 5
      window-seconds: 60
      key-resolver: IP
    "/api/public/**":
      algorithm: TOKEN_BUCKET
      limit: 1000
      window-seconds: 3600
      burst-capacity: 1500
```

Redis connectivity uses the standard Spring Data Redis properties:

```yaml
spring:
  data:
    redis:
      host: localhost
      port: 6379
      password:
      database: 0
      timeout: 2000
      jedis:
        pool:
          max-active: 8
          max-idle: 8
          min-idle: 0
```

Try these endpoints to see rate limiting in action:
```bash
# Default rate limit (100 req/min)
curl http://localhost:8080/api/demo/default

# Strict rate limit (5 req/min)
curl http://localhost:8080/api/demo/strict

# Per-second rate limit (2 req/sec)
curl http://localhost:8080/api/demo/per-second

# User-based rate limit
curl -H "X-User-ID: user123" http://localhost:8080/api/demo/user-based

# API key-based rate limit
curl -H "X-API-Key: myapikey" http://localhost:8080/api/demo/api-key-based

# Custom header rate limit
curl -H "X-Client-ID: client123" http://localhost:8080/api/demo/custom-header
```

The service also exposes management endpoints:

```bash
# Check rate limit status
curl "http://localhost:8080/api/rate-limit/check/test-user?limit=10&windowSeconds=60"

# Reset rate limit for a key
curl -X DELETE http://localhost:8080/api/rate-limit/reset/test-user

# Test Token Bucket
curl -X POST "http://localhost:8080/api/rate-limit/test/token-bucket?key=test&limit=5&windowSeconds=10"

# Test Sliding Window
curl -X POST "http://localhost:8080/api/rate-limit/test/sliding-window?key=test&limit=5&windowSeconds=10"

# Health check
curl http://localhost:8080/api/rate-limit/health
```

When rate limiting is active, the following headers are included:
```
X-RateLimit-Limit: 100        # Total allowed requests
X-RateLimit-Remaining: 95     # Remaining requests
X-RateLimit-Reset: 45         # Seconds until reset
Retry-After: 15               # Seconds to wait (if rate limited)
```
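Clients can use these headers to pace themselves and back off when throttled. The sketch below, using Java's built-in HttpClient, shows one way to honor Retry-After; the endpoint URL is just an example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of a client that respects the rate-limit headers (illustrative only).
public class RateLimitAwareClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/api/demo/default"))
                .GET()
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() == 429) {
            // Wait for the advertised Retry-After before retrying (default 1s if absent)
            long waitSeconds = response.headers()
                    .firstValue("Retry-After")
                    .map(Long::parseLong)
                    .orElse(1L);
            Thread.sleep(waitSeconds * 1000);
            response = client.send(request, HttpResponse.BodyHandlers.ofString());
        }

        // The remaining budget can be used to pace subsequent requests
        response.headers().firstValue("X-RateLimit-Remaining")
                .ifPresent(remaining -> System.out.println("Remaining: " + remaining));
        System.out.println("Status: " + response.statusCode());
    }
}
```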
When the rate limit is exceeded, you'll receive a 429 response:

```json
{
  "timestamp": "2025-01-15T10:30:00",
  "status": 429,
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Please try again later.",
  "limit": 100,
  "remaining": 0,
  "resetAfterSeconds": 45,
  "retryAfterSeconds": 45
}
```

```java
@RateLimit(keyResolver = "IP")
```

Limits based on client IP address. Considers proxy headers (X-Forwarded-For, X-Real-IP).
```java
@RateLimit(keyResolver = "USER")
```

Limits based on authenticated user. Checks:
- X-User-ID header
- Principal name from security context
- Session ID
- Falls back to IP if none found
```java
@RateLimit(keyResolver = "API_KEY")
```

Limits based on API key. Checks:
- X-API-Key header
- Authorization header (Bearer token)
- api_key query parameter
- Falls back to IP if none found
```java
@RateLimit(
    keyResolver = "CUSTOM",
    keyExpression = "header:X-Tenant-ID"
)
```

Limits based on custom expression. Supports:
- `header:HeaderName`: extract the key from an HTTP header
- `param:ParamName`: extract the key from a query parameter
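To illustrate how such an expression can be resolved, and how the resolvers fall back to the client IP, here is a hypothetical sketch. It is not the project's actual resolver; the class name is invented and it assumes the Jakarta servlet API.

```java
import jakarta.servlet.http.HttpServletRequest;

// Hypothetical sketch of resolving "header:Name" / "param:Name" expressions,
// falling back to the client IP when nothing matches.
public class CustomKeyResolverSketch {

    public String resolveKey(HttpServletRequest request, String keyExpression) {
        if (keyExpression != null) {
            if (keyExpression.startsWith("header:")) {
                String value = request.getHeader(keyExpression.substring("header:".length()));
                if (value != null && !value.isBlank()) {
                    return value;
                }
            } else if (keyExpression.startsWith("param:")) {
                String value = request.getParameter(keyExpression.substring("param:".length()));
                if (value != null && !value.isBlank()) {
                    return value;
                }
            }
        }
        // Fall back to the client IP, honoring common proxy headers
        String forwarded = request.getHeader("X-Forwarded-For");
        if (forwarded != null && !forwarded.isBlank()) {
            return forwarded.split(",")[0].trim();
        }
        return request.getRemoteAddr();
    }
}
```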
Run the comprehensive test suite:
```bash
# Run all tests
./gradlew test

# Run specific test class
./gradlew test --tests TokenBucketRateLimiterTest

# Run with coverage
./gradlew test jacocoTestReport
```

Use Token Bucket when:
- Your API can handle occasional bursts
- You want to allow clients to "save up" capacity
- Traffic patterns are variable
- Example: Media upload APIs, batch processing APIs
Use Sliding Window when:
- You need precise, consistent limits
- You want to prevent gaming the system
- Compliance requires strict rate limiting
- Example: Authentication APIs, payment APIs, SMS sending
- Start Conservative: Begin with strict limits and loosen as needed
- Different Limits for Different Operations: Read operations can have higher limits than writes
- Authenticated vs Anonymous: Give authenticated users higher limits
- Monitor and Adjust: Use logging and metrics to tune limits based on actual usage
- Use Appropriate Windows: Shorter windows (1-60 sec) for real-time APIs, longer windows (1+ hour) for resource-intensive operations
- Configure Redis for high availability (Sentinel/Cluster)
- Set appropriate Redis connection pool sizes
- Enable Redis persistence for rate limit state
- Monitor Redis memory usage
- Set up alerts for rate limit violations
- Document rate limits in API documentation
- Implement retry logic in clients
- Consider using API gateway for additional protection
```
          ┌─────────────┐
          │   Client    │
          └──────┬──────┘
                 │ HTTP Request
                 ▼
┌─────────────────────────────────┐
│     Spring Boot Application     │
│                                 │
│   ┌─────────────────────────┐   │
│   │  RateLimitInterceptor   │   │
│   └────────────┬────────────┘   │
│                │                │
│   ┌────────────▼────────────┐   │
│   │    RateLimiterService   │   │
│   └────────────┬────────────┘   │
│                │                │
│         ┌──────┴──────┐         │
│         │             │         │
│   ┌─────▼─────┐ ┌─────▼─────┐   │
│   │   Token   │ │  Sliding  │   │
│   │   Bucket  │ │  Window   │   │
│   └─────┬─────┘ └─────┬─────┘   │
│         │             │         │
└─────────┼─────────────┼─────────┘
          │             │
          ▼             ▼
      ┌───────────────────┐
      │   Redis Server    │
      │   (Rate Limit     │
      │   State Storage)  │
      └───────────────────┘
```
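The RateLimitInterceptor is what ties the `@RateLimit` annotation to the service at request time. Registration along the following lines is typical for Spring MVC; this is a sketch of the general pattern, and the project's actual configuration class may differ.

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

// Sketch of how a rate-limit interceptor is typically registered (illustrative).
@Configuration
public class WebConfigSketch implements WebMvcConfigurer {

    private final RateLimitInterceptor rateLimitInterceptor;

    public WebConfigSketch(RateLimitInterceptor rateLimitInterceptor) {
        this.rateLimitInterceptor = rateLimitInterceptor;
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Apply rate limiting to API routes only
        registry.addInterceptor(rateLimitInterceptor)
                .addPathPatterns("/api/**");
    }
}
```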
The rate limiter is designed for high performance:
- Atomic Operations: Uses Redis Lua scripts for atomicity
- O(1) Complexity: Token Bucket operations are constant time
- Efficient Cleanup: Sliding Window automatically removes old entries
- Connection Pooling: Jedis connection pool for optimal throughput
- Fail-Safe: Continues operation if Redis is unavailable
Benchmarks (on modest hardware):
- Token Bucket: ~5000 checks/second per instance
- Sliding Window: ~3000 checks/second per instance
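To illustrate the atomicity and fail-safe points above, the sketch below runs a token-bucket check as a single Lua script through Spring's StringRedisTemplate and fails open if Redis is unreachable. The script and class are illustrative, not the project's actual implementation.

```java
import java.util.List;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;

// Illustrative sketch: an atomic token-bucket check via a Redis Lua script,
// failing open when Redis is unavailable. Not the project's actual code.
public class AtomicTokenBucketSketch {

    // KEYS[1] = bucket key; ARGV = capacity, refill rate (tokens/sec), now (sec), tokens requested
    private static final String LUA = """
            local tokens = tonumber(redis.call('HGET', KEYS[1], 'tokens') or ARGV[1])
            local last   = tonumber(redis.call('HGET', KEYS[1], 'ts') or ARGV[3])
            tokens = math.min(tonumber(ARGV[1]), tokens + (tonumber(ARGV[3]) - last) * tonumber(ARGV[2]))
            local allowed = 0
            if tokens >= tonumber(ARGV[4]) then
              tokens = tokens - tonumber(ARGV[4])
              allowed = 1
            end
            redis.call('HSET', KEYS[1], 'tokens', tostring(tokens), 'ts', ARGV[3])
            redis.call('EXPIRE', KEYS[1], 3600)
            return allowed
            """;

    private static final DefaultRedisScript<Long> SCRIPT = new DefaultRedisScript<>(LUA, Long.class);

    private final StringRedisTemplate redis;

    public AtomicTokenBucketSketch(StringRedisTemplate redis) {
        this.redis = redis;
    }

    public boolean isAllowed(String key, long capacity, double refillRate) {
        try {
            long nowSeconds = System.currentTimeMillis() / 1000;
            Long allowed = redis.execute(SCRIPT, List.of(key),
                    String.valueOf(capacity), String.valueOf(refillRate),
                    String.valueOf(nowSeconds), "1");
            return allowed != null && allowed == 1L;
        } catch (Exception e) {
            // Fail open: if Redis is unavailable, let the request through
            return true;
        }
    }
}
```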
Error: Unable to connect to Redis
Solution: Ensure Redis is running and accessible:
```bash
redis-cli ping
# Should return: PONG
```

If rate limiting does not seem to be applied, check:
- Is rate limiting enabled in config? (`rate-limiter.enabled: true`)
- Is the endpoint annotated with `@RateLimit`?
- Is Redis accessible and storing data?
- Check logs for errors
```bash
# Rapid fire requests to test
for i in {1..10}; do
  curl -w "\nStatus: %{http_code}\n" http://localhost:8080/api/demo/strict
  sleep 0.1
done
```

This project is licensed under the MIT License.
Contributions are welcome! Please feel free to submit pull requests.
For issues and questions, please create an issue in the repository.