This directory contains load testing scripts for the OER-AI application using Locust, an open-source load testing tool written in Python.
Locust is a scalable, distributed load testing framework that allows you to:
- Define user behavior in Python code
- Simulate thousands of concurrent users
- Monitor performance in real-time via web UI
- Generate detailed performance reports
Key Features:
- Code-based: Write test scenarios in Python (no XML/YAML)
- Distributed: Run tests across multiple machines
- Real-time monitoring: Web UI shows live statistics
- Flexible: Simulate complex user behaviors
- Python 3.11+ installed
- pip package manager
```bash
pip install locust websocket-client
```
Verify installation:
```bash
locust --version
```
The main load testing script, `locustfile.py`, simulates realistic user behavior:
- User Session Creation - Create an authenticated user session
- Browse Textbooks - List textbooks with pagination
- View Textbook Details - Navigate to individual textbook pages
- Chat with LLM - Create chat sessions and send messages via WebSocket
- View FAQ - Fetch frequently asked questions for a textbook
- Generate Practice Material - Create practice questions/flashcards via WebSocket
Task Weights:
- Browse textbooks: 2 (low - periodic browsing)
- View textbook details: 10 (high - users spend most time here)
- Chat with LLM: 8 (high - primary user activity)
- View FAQ: 3 (medium - occasional reference)
- Generate practice material: 5 (medium-high - important feature)
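Locust picks a user's next task in proportion to these weights. As a rough sketch (not Locust's internal scheduler, just the same weighted-sampling idea), the split works out like this:

```python
import random
from collections import Counter

# Task weights from the list above (total = 28), so e.g. viewing textbook
# details should account for roughly 10/28 ≈ 36% of all task picks.
TASK_WEIGHTS = {
    "browse_textbooks": 2,
    "view_textbook_details": 10,
    "chat_with_llm": 8,
    "view_faq": 3,
    "generate_practice_material": 5,
}

def pick_tasks(n: int, seed: int = 42) -> Counter:
    """Simulate n weighted task selections, the way Locust favors
    higher-weighted @task methods."""
    rng = random.Random(seed)
    names = list(TASK_WEIGHTS)
    weights = [TASK_WEIGHTS[name] for name in names]
    return Counter(rng.choices(names, weights=weights, k=n))

counts = pick_tasks(10_000)
# Expect view_textbook_details > chat_with_llm > browse_textbooks.
```

In the locustfile itself these weights correspond to `@task(<weight>)` decorators on the task methods.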
Start Locust with the web interface:
```bash
cd tests
locust -f locustfile.py --host=https://qscs7f1rm2.execute-api.ca-central-1.amazonaws.com
```
Then:
- Open http://localhost:8089 in your browser
- Set number of users (e.g., 10)
- Set spawn rate (e.g., 1 user/second)
- Click "Start swarming"
Run without web UI for automated testing:
```bash
cd tests
locust -f locustfile.py \
  --host=https://qscs7f1rm2.execute-api.ca-central-1.amazonaws.com \
  --headless \
  -u 10 \
  -r 1 \
  --run-time 5m
```
Parameters:
- `-u 10`: Simulate 10 concurrent users
- `-r 1`: Spawn 1 user per second
- `--run-time 5m`: Run for 5 minutes
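One thing to keep in mind when choosing these values: full load is only reached after `users / spawn_rate` seconds, so the ramp-up eats into the run time. A quick check:

```python
def ramp_seconds(users: int, spawn_rate: float) -> float:
    """Seconds until all users are spawned (-u divided by -r)."""
    return users / spawn_rate

# With -u 10 and -r 1, all 10 users are running after 10 seconds,
# leaving about 4m50s of the 5-minute run at steady state.
print(ramp_seconds(10, 1))  # → 10.0
```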
For high-scale testing, run Locust in distributed mode:
Master:
```bash
locust -f locustfile.py --master --host=https://qscs7f1rm2.execute-api.ca-central-1.amazonaws.com
```
Workers (run on multiple machines):
```bash
locust -f locustfile.py --worker --master-host=<master-ip>
```
When using the web UI, you'll see:
- Statistics Table: Request counts, response times (median, 95th percentile), error rates
- Charts: Real-time graphs of RPS (requests per second) and response times
- Failures: Detailed error messages and counts
- Current Users: Number of active simulated users
In headless mode, Locust prints periodic statistics:
```
Type     Name                                         # reqs   # fails   |  Avg  Min  Max  Median  |  req/s  failures/s
---------|--------------------------------------------|--------|----------|------|-----|-----|--------|--------|----------
GET      /public/config/welcomeMessage                    45   0(0.00%)  |  120   89  250     110  |   1.50        0.00
GET      /textbooks                                      112   0(0.00%)  |  135   95  310     130  |   3.73        0.00
GET      /textbooks/{id}                                 225   0(0.00%)  |  142  100  340     140  |   7.50        0.00
POST     /textbooks/{id}/chat_sessions (create)           89   0(0.00%)  |  156  110  380     150  |   2.97        0.00
---------|--------------------------------------------|--------|----------|------|-----|-----|--------|--------|----------
         Aggregated                                      471   0(0.00%)  |  140   89  380     135  |  15.70        0.00
```
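If you want to post-process this console output (say, to fail a CI job on any errors), a small parser is enough. The regex below assumes the column layout shown in the sample above:

```python
import re

# Matches rows like: "GET /textbooks 112 0(0.00%) | 135 95 310 130 | 3.73 0.00"
STATS_ROW = re.compile(
    r"\s*(GET|POST|PUT|DELETE)"   # HTTP method
    r"\s+(\S+(?:\s\(\w+\))?)"     # endpoint name, optional "(create)" suffix
    r"\s+(\d+)"                   # request count
    r"\s+(\d+)\(([\d.]+)%\)"      # failure count and percentage
)

def parse_stats_row(line: str):
    """Extract method, endpoint, and request/failure counts from one
    Locust headless stats row (layout assumed from the sample above)."""
    m = STATS_ROW.match(line)
    if not m:
        return None
    method, name, reqs, fails, fail_pct = m.groups()
    return {"method": method, "name": name, "reqs": int(reqs),
            "fails": int(fails), "fail_pct": float(fail_pct)}

row = parse_stats_row("GET /textbooks 112 0(0.00%) | 135 95 310 130 | 3.73 0.00")
# → {'method': 'GET', 'name': '/textbooks', 'reqs': 112, 'fails': 0, 'fail_pct': 0.0}
```

In practice Locust's `--csv` option gives you the same numbers in a machine-readable form, so a parser like this is only needed when all you have is console output.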
While load testing, monitor these CloudWatch metrics:

API Gateway:
- Count: Total requests
- Latency: Response times (p50, p90, p99)
- 4XXError: Client errors (401, 403, 429)
- 5XXError: Server errors

Lambda:
- Invocations: Number of calls
- Duration: Execution time
- Errors: Failed invocations
- Throttles: Rate-limited requests
- ConcurrentExecutions: Active instances

RDS:
- DatabaseConnections: Active connections
- CPUUtilization: CPU usage
- ReadLatency / WriteLatency: Query performance
Check these log groups:
- `<StackPrefix>-ApiAccessLogs` - API access logs
- `API-Gateway-Execution-Logs_<api-id>/prod` - API execution logs
- `/aws/lambda/<function-name>` - Lambda function logs
- `/aws/rds/instance/<instance-id>/postgresql` - RDS logs
- ✅ Fetch welcome message
- ✅ Browse textbooks with pagination
- ✅ View individual textbook details
- ✅ Token refresh on expiration
- ✅ Create user session via API
- ✅ Create chat session for textbook
- ✅ Establish WebSocket connection with token auth
- ✅ Send chat messages via WebSocket
- ✅ Handle streaming responses (start → chunks → complete)
- ✅ Heartbeat/ping-pong mechanism
- ✅ Token refresh after user session creation
- ✅ Fetch FAQ list for textbook
- ✅ Handle 404 gracefully (no FAQs yet)
- ✅ Generate practice materials via WebSocket
- ✅ Track practice material progress (initializing → retrieving → generating → validating → complete)
- ✅ Support multiple material types (MCQ, flashcard, short answer)
- ✅ Random difficulty selection (beginner, intermediate, advanced)
- ✅ Automatic token refresh on 401 errors
- ✅ WebSocket reconnection on disconnect
- ✅ Graceful handling of 404 errors
- ✅ Response validation and error reporting
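The streaming flow exercised above (start → chunks → complete) boils down to accumulating chunk frames until a completion frame arrives. A minimal sketch — the `{"type": ..., "content": ...}` frame schema here is an assumption for illustration, not the application's actual wire format:

```python
import json

def assemble_stream(frames):
    """Accumulate a streamed LLM reply from start → chunk → complete frames.
    The {"type": ..., "content": ...} schema is assumed for illustration."""
    chunks = []
    for raw in frames:
        msg = json.loads(raw)
        if msg["type"] == "start":
            chunks.clear()                 # new response begins
        elif msg["type"] == "chunk":
            chunks.append(msg["content"])  # partial text
        elif msg["type"] == "complete":
            return "".join(chunks)         # full response assembled
    raise ValueError("stream ended without a 'complete' frame")

frames = [
    '{"type": "start"}',
    '{"type": "chunk", "content": "Hello"}',
    '{"type": "chunk", "content": ", world"}',
    '{"type": "complete"}',
]
reply = assemble_stream(frames)  # → "Hello, world"
```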
Based on initial testing:
| Endpoint | Avg Response Time | 95th Percentile | Notes |
|---|---|---|---|
| `/public/config/welcomeMessage` | ~120ms | ~200ms | Simple config fetch |
| `/user_sessions` (POST) | ~150ms | ~250ms | User session creation |
| `/textbooks` | ~135ms | ~250ms | Database query with pagination |
| `/textbooks/{id}` | ~142ms | ~280ms | Single textbook lookup |
| `/textbooks/{id}/faq` | ~140ms | ~260ms | FAQ list fetch |
| Chat session creation | ~156ms | ~300ms | Database insert |
| WebSocket chat | ~2-5s | ~8s | LLM generation (streaming) |
| WebSocket practice material | ~3-8s | ~12s | LLM generation with validation |
1. Token Fetch Fails

   ```
   ✗ Failed to fetch token. Status: 403
   ```

   Solution: Check the API Gateway endpoint and ensure `/user/publicToken` is accessible.

2. WebSocket Connection Fails

   ```
   [WebSocket] ✗ Error: Connection refused
   ```

   Solution: Verify the WebSocket URL and ensure the token is valid.

3. 429 Too Many Requests

   ```
   Got status code 429
   ```

   Solution: Reduce the number of users or the spawn rate. Check API Gateway throttling limits.

4. Chat Session Creation Fails

   ```
   Got status code 404
   ```

   Solution: Ensure the textbook ID exists and the endpoint path is correct.
Enable verbose logging:
```bash
locust -f locustfile.py --host=... --loglevel DEBUG
```
- Start Small: Begin with 1-5 users to verify functionality
- Ramp Gradually: Increase load slowly to find breaking points
- Monitor CloudWatch: Watch for errors, throttling, and high latency
- Test Realistic Scenarios: Match actual user behavior patterns
- Run During Off-Peak: Avoid impacting real users
- Document Results: Record baselines and performance changes
```bash
# 1. Smoke test (verify functionality)
locust -f locustfile.py --host=... --headless -u 1 -r 1 --run-time 1m

# 2. Light load (baseline performance)
locust -f locustfile.py --host=... --headless -u 5 -r 1 --run-time 5m

# 3. Medium load (typical usage)
locust -f locustfile.py --host=... --headless -u 10 -r 1 --run-time 10m

# 4. Stress test (find limits)
locust -f locustfile.py --host=... --headless -u 20 -r 2 --run-time 15m

# 5. Spike test (sudden traffic) - CAUTION: May trigger WAF/rate limits
locust -f locustfile.py --host=... --headless -u 50 -r 5 --run-time 5m
```
⚠️ Important: High user counts (50+) or fast spawn rates (10+/sec) may trigger WAF rules or API Gateway rate limits, resulting in 403 errors. Start small and gradually increase load to find safe limits.
```yaml
name: Load Test

on:
  schedule:
    - cron: "0 2 * * *" # Daily at 2 AM
  workflow_dispatch:

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: |
          pip install locust websocket-client
      - name: Run load test
        run: |
          cd tests
          locust -f locustfile.py \
            --host=https://qscs7f1rm2.execute-api.ca-central-1.amazonaws.com \
            --headless \
            -u 10 \
            -r 1 \
            --run-time 5m \
            --html=report.html
      - name: Upload report
        uses: actions/upload-artifact@v3
        with:
          name: load-test-report
          path: tests/report.html
```
For issues or questions:
- Check CloudWatch logs for errors
- Review Locust console output
- Verify API endpoints in Swagger definition
- Check network connectivity and authentication