Design a rate limiter for:
- Login attempts: Max 3 in 10 minutes
- Video views: Max 20/day
- Comments: Max 100/day
Must support millions of users with low latency and high availability.
- Per-user and per-IP request limiting
- Configurable rules per API endpoint
- Real-time blocking if limit exceeded
- Latency ≤ 20 ms
- High throughput (10K+ RPS)
- 99.99% availability
- Low operational cost
- Horizontal scalability
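
A minimal sketch of how the three limits and the per-user/per-IP requirement could be written down as data. The endpoint paths, the `Rule` type, and the `checks_for` helper are assumptions for illustration, as is the choice to apply the same limit to the IP and to the user.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    limit: int           # maximum requests per window
    window_seconds: int  # window length in seconds


# Limits from the problem statement; the endpoint paths are assumed names.
RULES = {
    "/login": Rule(limit=3, window_seconds=10 * 60),
    "/videos/view": Rule(limit=20, window_seconds=24 * 60 * 60),
    "/comments": Rule(limit=100, window_seconds=24 * 60 * 60),
}


def checks_for(endpoint: str, user_id: Optional[str], ip: str) -> List[Tuple[str, Rule]]:
    """Return the (counter key, rule) pairs a request must pass.

    Every request is limited by IP; authenticated requests are also
    limited by user ID. Endpoints without a rule produce no checks.
    """
    rule = RULES.get(endpoint)
    if rule is None:
        return []
    keys = [f"ip:{ip}:{endpoint}"]
    if user_id is not None:
        keys.append(f"user:{user_id}:{endpoint}")
    return [(key, rule) for key in keys]


# Anonymous login attempt: only the IP counter applies.
print(checks_for("/login", user_id=None, ip="203.0.113.7"))
# Authenticated comment: both the IP and the user counter apply.
print(checks_for("/comments", user_id="123", ip="203.0.113.7"))
```

A request would be allowed only if every returned check passes.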
[ Client ]
    │
    ▼
[ API Gateway ]
    │
    ▼
[ Rate Limiter (Lambda or ECS/Fargate) ]
    │      ▲
    │      │
    ▼      │
[ Redis (ElastiCache) ] ← stores counters
    │
    ▼
[ DynamoDB ] ← stores config (limits per API)
Component | Purpose |
---|---|
API Gateway | Entry point, optional usage plans |
AWS Lambda (or Fargate) | Stateless compute for checking limits |
ElastiCache Redis | Fast, atomic counters with TTL |
DynamoDB | Stores API limit rules, fallback store |
CloudWatch + X-Ray | Monitoring + tracing |
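
To illustrate the "atomic counters with TTL" role of Redis in the table above, here is a rough sketch of the increment-then-expire pattern. `TtlCounterStore` is a hypothetical in-memory stand-in, not a Redis client; against Redis the same step would typically be an `INCR` followed by an `EXPIRE` on the first hit.

```python
import time
from typing import Dict, Optional, Tuple


class TtlCounterStore:
    """In-memory stand-in for Redis INCR plus EXPIRE (not thread-safe)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (count, expires_at)

    def incr(self, key: str, ttl_seconds: int, now: Optional[float] = None) -> int:
        """Increment the counter, starting a new TTL when the key is new or expired."""
        now = time.time() if now is None else now
        count, expires_at = self._data.get(key, (0, 0.0))
        if now >= expires_at:
            # Key is missing or expired: start a fresh window.
            count, expires_at = 0, now + ttl_seconds
        count += 1
        self._data[key] = (count, expires_at)
        return count


def allow(store: TtlCounterStore, key: str, limit: int,
          window_seconds: int, now: Optional[float] = None) -> bool:
    """Allow the request while the counter for this window is within the limit."""
    return store.incr(key, window_seconds, now) <= limit


store = TtlCounterStore()
# Login rule from the problem statement: 3 attempts per 10 minutes.
attempts = [allow(store, "user:123:/login", 3, 600, now=t) for t in (0.0, 1.0, 2.0, 3.0)]
print(attempts)  # [True, True, True, False]
print(allow(store, "user:123:/login", 3, 600, now=600.0))  # True: the window expired
```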
user:123:/comments → Sorted Set [timestamp1, timestamp2, ...]
TTL: 24h
PK: API:/comments | SK: default
Limit: 100 | TimeWindow: 24h
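
The data model above maps onto a sliding window log: one sorted set of timestamps per key, trimmed on every request. The sketch below is an in-memory approximation under that assumption; `SlidingWindowLog` and `window_seconds` are hypothetical names, and the comments name the Redis commands each step would correspond to.

```python
import bisect
import time
from typing import Dict, List, Optional


def window_seconds(text: str) -> int:
    """Convert '24h' or '10m' to seconds (assumed format: integer plus unit)."""
    units = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    return int(text[:-1]) * units[text[-1]]


class SlidingWindowLog:
    """In-memory stand-in for one Redis sorted set of timestamps per key."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._log: Dict[str, List[float]] = {}  # key -> ascending timestamps

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        stamps = self._log.setdefault(key, [])
        # Drop entries at or before the window start (Redis: ZREMRANGEBYSCORE).
        del stamps[: bisect.bisect_right(stamps, now - self.window)]
        # Count what remains (Redis: ZCARD) and reject at the limit.
        if len(stamps) >= self.limit:
            return False
        # Record this request (Redis: ZADD, then EXPIRE to refresh the TTL).
        bisect.insort(stamps, now)
        return True


# Config item shaped like the DynamoDB record above.
item = {"PK": "API:/comments", "SK": "default", "Limit": 100, "TimeWindow": "24h"}
limiter = SlidingWindowLog(item["Limit"], window_seconds(item["TimeWindow"]))
key = "user:123:/comments"

allowed = sum(limiter.allow(key, now=float(t)) for t in range(101))
print(allowed)  # 100: the 101st comment inside the window is rejected
print(limiter.allow(key, now=86400.0))  # True: the request at t=0 has aged out
```

In Redis the trim, count, and add steps would need to run atomically (for example in a transaction or a Lua script) to avoid races between concurrent requests.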
- Fixed Window Counter for login
- Sliding Window (Sorted Set) for comment and view tracking
- Token Bucket for smoother burst control (optional)
- Redis (ElastiCache) is deployed Multi-AZ in cluster mode
- Lambda scales automatically
- Fail-open for non-critical APIs (e.g., video view)
- Fail-closed for critical ones (e.g., login attempts)
- CloudWatch custom metrics:
- Requests blocked
- Rule violations
- X-Ray for tracing Lambda execution
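
The failure policy and the blocked-request metric from the bullets above could look roughly like this. The policy table, the error types, the namespace, and the metric name are all assumptions; the payload is only shaped for CloudWatch `PutMetricData` and is not sent anywhere here.

```python
from typing import Callable, Dict, Set

# Assumed policy: only login fails closed; every other endpoint fails open.
FAIL_CLOSED: Set[str] = {"/login"}
# Stand-ins for the errors a real Redis client would raise during an outage.
BACKEND_ERRORS = (ConnectionError, TimeoutError)


def decide(endpoint: str, check: Callable[[], bool]) -> bool:
    """Return True to allow the request, applying the failure policy on errors."""
    try:
        return check()
    except BACKEND_ERRORS:
        # Counter store unreachable: block critical endpoints, allow the rest.
        return endpoint not in FAIL_CLOSED


def blocked_metric(endpoint: str, count: int = 1) -> Dict[str, object]:
    """Build keyword arguments shaped for CloudWatch PutMetricData."""
    return {
        "Namespace": "RateLimiter",  # hypothetical namespace
        "MetricData": [
            {
                "MetricName": "RequestsBlocked",  # hypothetical metric name
                "Dimensions": [{"Name": "Endpoint", "Value": endpoint}],
                "Value": float(count),
                "Unit": "Count",
            }
        ],
    }


def store_down() -> bool:
    """Simulate a limiter check while the counter store is unavailable."""
    raise ConnectionError("counter store unavailable")


for endpoint in ("/videos/view", "/login"):
    allowed = decide(endpoint, store_down)
    print(endpoint, "allowed" if allowed else "blocked")
    if not allowed:
        payload = blocked_metric(endpoint)
        # With boto3 this could be sent as:
        # boto3.client("cloudwatch").put_metric_data(**payload)
        print(payload["MetricData"][0]["MetricName"])
```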
Option | Trade-off |
---|---|
IP vs User ID | IP works for anonymous traffic, but users behind a shared IP (NAT, proxy) collide |
Redis vs DB | Redis is fast but volatile; fall back to the DB |
Fail-open | Better UX, worse protection |
Sliding vs Token | Sliding is precise, token is burst-friendly |
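
For the sliding vs token trade-off, a small token bucket sketch shows what "burst-friendly" means: a full bucket absorbs a burst up to its capacity, then requests are admitted at the refill rate. The class name and parameters are illustrative, and the state lives in process memory rather than Redis.

```python
import time
from typing import Optional


class TokenBucket:
    """Single-key token bucket held in process memory (not thread-safe)."""

    def __init__(self, capacity: int, refill_per_second: float,
                 now: Optional[float] = None) -> None:
        self.capacity = capacity
        self.rate = refill_per_second
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Refill for the elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.rate)
        self.updated = now
        # Spend one token per request if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=5, refill_per_second=1.0, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(6)]
print(burst)                  # [True, True, True, True, True, False]
print(bucket.allow(now=2.0))  # True: two tokens refilled after two seconds
```

Unlike the sliding window, the bucket does not record individual request times, so it uses constant memory per key but cannot say exactly how many requests fell inside a given window.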
This architecture:
- Handles real-time request limiting
- Is fully serverless (or containerized if needed)
- Leverages Redis + DynamoDB for speed and persistence
- Scales easily and is cost-efficient
If you have 30 minutes for this question:
- Spend first 5 mins clarifying scope
- Next 10 mins sketch out architecture & flow
- Last 15 mins write structured answer like above
(Include trade-offs and AWS reasoning; this is what bar-raisers look for!)