Load testing tools for finding out how your API actually performs under stress
I built this to answer questions that always come up before launch: "Can our API handle 100 simultaneous users?" "What happens during a traffic spike?" "Will it stay stable under sustained load?"
This framework makes it easy to find out.
Performance testing tells you things functional tests can't:
- How many users can you handle before things slow down?
- Where are the bottlenecks in your system?
- Does performance degrade over time? (memory leaks, connection leaks, etc.)
- What happens when traffic suddenly spikes?
I built this to test the FastAPI application from my other project, but it works against any HTTP API.
```bash
# Clone and set up
git clone https://github.com/JasonTeixeira/Performance-Testing-Framework.git
cd Performance-Testing-Framework

# Virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

The easiest way is using the CLI script:
```bash
# See available test scenarios
python run_load_test.py list

# Run a basic load test
python run_load_test.py run basic --host http://localhost:8000

# Quick smoke test (30 seconds)
python run_load_test.py quick http://localhost:8000
```

Or run Locust directly:
```bash
locust -f locustfiles/api_load_test.py --host http://localhost:8000
```

Then open http://localhost:8089 to see the web UI.
**Basic** (10 users, 2 minutes)

Models realistic user behavior: logging in, browsing, updating profiles. Good for baseline performance.

```bash
python run_load_test.py run basic
```

**Spike** (100 users, 1 minute, rapid spawn)

A sudden traffic burst. Tests how your API holds up when traffic jumps quickly (like getting featured on Reddit).

```bash
python run_load_test.py run spike
```

**Endurance** (20 users, 10 minutes)

Sustained load over time. Catches memory leaks, connection pool issues, and gradual performance degradation.

```bash
python run_load_test.py run endurance
```

**Stress** (500 users, 5 minutes)

Finds the breaking point. Keeps ramping up until something gives.

```bash
python run_load_test.py run stress
```

The tests don't just hammer one endpoint. They model actual user behavior:
- User logs in (gets JWT token)
- Browses around (views profile, lists users)
- Updates things (profile changes)
- Health checks (like monitoring tools do)
Each action has a weight based on how often it happens in real usage. Profile views are more common than updates.
Users don't make requests instantly. The framework adds realistic "think time" between actions - usually 1-3 seconds, like a real person navigating.
For spike tests, wait time is much shorter (0.1-0.5s) to simulate frantic clicking.
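For a sense of how weights and think time fit together in a locustfile, here's a minimal sketch; the endpoint paths, weights, and payload are illustrative assumptions, not the framework's actual values:

```python
from locust import HttpUser, task, between

class ApiUser(HttpUser):
    # Think time: pause 1-3 seconds between tasks, like a real person
    wait_time = between(1, 3)

    @task(5)  # weight 5: profile views are the most common action
    def view_profile(self):
        self.client.get("/users/me")  # assumed endpoint

    @task(1)  # weight 1: updates happen far less often
    def update_profile(self):
        self.client.put("/users/me", json={"bio": "hello"})  # assumed endpoint
```

A spike-test user would look the same, just with `wait_time = between(0.1, 0.5)`.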
The test automatically handles JWT authentication:
- Logs in once at the start
- Uses the token for all subsequent requests
- Handles expiration gracefully
This is important because authentication adds overhead to every request.
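In Locust terms, that flow typically looks something like the sketch below (the login path, credentials, and token field are assumptions about the API under test, not confirmed details of `api_load_test.py`):

```python
from locust import HttpUser, task, between

class AuthenticatedUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        # Runs once per simulated user, before any tasks
        self._login()

    def _login(self):
        resp = self.client.post(
            "/auth/login",  # assumed login endpoint
            json={"username": "loadtest", "password": "loadtest"},  # assumed creds
        )
        token = resp.json().get("access_token", "")
        # Reuse the JWT on every subsequent request
        self.client.headers["Authorization"] = f"Bearer {token}"

    @task
    def list_users(self):
        with self.client.get("/users", catch_response=True) as resp:
            if resp.status_code == 401:
                # Token likely expired: log in again for the next cycle
                self._login()
                resp.failure("token expired; re-authenticated")
            else:
                resp.success()
```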
Locust shows several percentiles:
- p50 (median): Half of requests are faster than this
- p95: 95% of requests are faster (captures slow outliers)
- p99: 99% are faster (the really slow ones)
Don't just look at the average: the p95 and p99 times matter more for user experience.
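A toy example shows why: a few slow outliers barely move the mean but dominate the tail (plain Python, made-up latencies):

```python
# Made-up latency sample (ms): mostly fast, a few very slow outliers
latencies = sorted([40] * 90 + [60] * 8 + [2000, 3000])

def percentile(data, p):
    # Nearest-rank percentile; data must already be sorted
    idx = max(0, round(p / 100 * len(data)) - 1)
    return data[idx]

mean = sum(latencies) / len(latencies)
print(f"mean: {mean:.0f} ms")                   # ~91 ms, looks fine
print(f"p50:  {percentile(latencies, 50)} ms")  # 40 ms
print(f"p95:  {percentile(latencies, 95)} ms")  # 60 ms
print(f"p99:  {percentile(latencies, 99)} ms")  # 2000 ms, the real story
```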
Failure rate: any non-200 response counts as a failure. Even a 1-2% failure rate means users are seeing errors.
Requests per second (RPS): how much throughput your API can handle. This should stay consistent if performance is stable.
If RPS drops over time with the same number of users, something is degrading (probably memory or connections).
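One way to catch that trend is to run Locust with its `--csv` flag and compare RPS early versus late in the run. The sketch below assumes the column names Locust currently writes to `*_stats_history.csv`, so verify them against your own output:

```python
import csv

# `locust --csv=report ...` appends a row to report_stats_history.csv
# every few seconds while the test runs
with open("report_stats_history.csv", newline="") as f:
    rows = [r for r in csv.DictReader(f) if r["Name"] == "Aggregated"]

# Compare average RPS in the first and last quarter of the run
q = max(1, len(rows) // 4)
early = sum(float(r["Requests/s"]) for r in rows[:q]) / q
late = sum(float(r["Requests/s"]) for r in rows[-q:]) / q

print(f"early RPS: {early:.1f}, late RPS: {late:.1f}")
if late < 0.9 * early:
    print("RPS fell more than 10% over the run: possible leak or pool exhaustion")
```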
```
Performance-Testing-Framework/
├── locustfiles/            # Test scenarios
│   └── api_load_test.py    # API load tests
├── scenarios/              # Additional test scenarios
├── utils/                  # Helper utilities
├── config/                 # Configuration files
│   └── test_config.yaml    # Test settings
├── reports/                # Test results (generated)
├── run_load_test.py        # CLI tool for easy execution
└── requirements.txt        # Python dependencies
```
Building this taught me:
About Load Testing:
- The difference between load, stress, spike, and endurance testing
- Why you can't just use functional tests for performance
- How to model realistic user behavior instead of just hitting endpoints
- The importance of percentiles over averages
About Performance:
- Authentication overhead matters at scale
- Database connection pools are critical
- Memory leaks show up in long-running tests
- Even small inefficiencies multiply under load
About APIs:
- Rate limiting needs careful tuning
- Keep-alive connections make a huge difference
- JWT token validation is surprisingly expensive
- Database queries that seem fast become bottlenecks at scale
To test a different API:
- Create a new locustfile (copy `api_load_test.py` as a template)
- Model your user flow (login, browse, checkout, etc.)
- Add realistic wait times between actions
- Set task weights based on actual usage patterns
- Start small (5-10 users) and ramp up
Don't start by testing with 1000 users - you'll just break everything and learn nothing. Start low, find issues, fix them, then increase.
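As a starting skeleton, a locustfile for a hypothetical shop-style API might look like this (every endpoint, weight, and credential below is a placeholder to replace with your own):

```python
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    wait_time = between(1, 3)  # realistic think time between actions

    def on_start(self):
        # Placeholder login for the API under test
        self.client.post("/login", json={"user": "demo", "password": "demo"})

    @task(10)  # browsing dominates real traffic
    def browse_products(self):
        self.client.get("/products")

    @task(3)
    def view_product(self):
        self.client.get("/products/1")

    @task(1)  # checkouts are rare but matter most
    def checkout(self):
        self.client.post("/cart/checkout", json={"items": [1]})
```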
"Connection refused"
- Is your API actually running?
- Check the host URL (http vs https)
"Too many failures"
- Your API might not be able to handle the load
- Reduce user count and try again
- Check API logs for errors
"Tests run but no requests"
- Authentication might be failing
- Check credentials in the locustfile
"Response times increasing over time"
- Memory leak or connection pool exhaustion
- This is what endurance tests catch - it's valuable data!
✅ Start with low user counts and ramp up
✅ Model realistic user behavior
✅ Run tests for several minutes (not just seconds)
✅ Test on an environment similar to production
✅ Monitor your API server during tests (CPU, memory, etc.)

❌ Test directly against production (use staging!)
❌ Just hit one endpoint repeatedly
❌ Ignore authentication in your tests
❌ Only look at average response times
Found a bug or have a scenario to add? Open an issue or PR!
Jason Teixeira
- GitHub: @JasonTeixeira
- Email: sage@sageideas.org
MIT License - use it however you want.
I chose Locust because:
- Python-based - Easy to extend with custom logic
- Distributed testing - Can run across multiple machines
- Real-time web UI - Watch results as tests run
- Realistic simulation - Models actual user behavior, not just request spam
- Open source - No licensing costs
Other tools are good too (JMeter, k6, Gatling), but Locust's Python API makes it very flexible for complex scenarios.