A high-performance Zabbix load testing tool designed to simulate thousands of Zabbix agents sending metrics to Zabbix proxies/servers. This tool helps you stress-test your Zabbix infrastructure and identify performance bottlenecks.
- Massive Scale: Simulate 50,000+ hosts simultaneously
- Active Agent Simulation: Simulates Zabbix agent (active) sending metrics via trapper protocol
- LLD Support: Simulates Low-Level Discovery (LLD) rules with configurable intervals
- Multi-Proxy Support: Distribute load across multiple Zabbix proxies
- Real-time TUI: Terminal User Interface with live statistics and graphs
- Performance Profiling: Built-in debug and profiling tools for troubleshooting
- Multi-Instance Support: Distribute load testing across multiple servers
- Smart Buffering: Configurable metric buffering with jitter to avoid thundering herd
- Trigger Simulation: Simulate hosts with firing triggers
- Self-Throttling: Automatically adjust load based on Zabbix server health
zbx-load-testing → Zabbix Proxy(s) → Zabbix Server → PostgreSQL
The tool simulates Zabbix agents by:
- Creating hosts in Zabbix via API
- Sending metrics using the Zabbix sender protocol (trapper)
- Optionally sending LLD data to trigger auto-discovery
- Monitoring Zabbix health and adjusting load dynamically
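As an illustration of the sender (trapper) step above, here is a minimal sketch of how a metrics packet can be framed for port 10051. It assumes the standard Zabbix wire header (ZBXD, a flags byte, a 4-byte little-endian length, 4 reserved bytes) followed by a "sender data" JSON body. The names Metric and buildPacket and the item key are hypothetical and are not taken from this tool's source.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// Metric is one value in a "sender data" request (hypothetical type name).
type Metric struct {
	Host  string `json:"host"`
	Key   string `json:"key"`
	Value string `json:"value"`
	Clock int64  `json:"clock,omitempty"`
}

type senderRequest struct {
	Request string   `json:"request"`
	Data    []Metric `json:"data"`
}

// buildPacket frames the JSON payload with the Zabbix header:
// "ZBXD", flags byte 0x01, 4-byte little-endian payload length,
// and 4 reserved bytes left as zero.
func buildPacket(metrics []Metric) ([]byte, error) {
	payload, err := json.Marshal(senderRequest{Request: "sender data", Data: metrics})
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	buf.WriteString("ZBXD")
	buf.WriteByte(0x01)
	var length [8]byte
	binary.LittleEndian.PutUint32(length[:4], uint32(len(payload)))
	buf.Write(length[:])
	buf.Write(payload)
	return buf.Bytes(), nil
}

func main() {
	// The item key below is only an example value.
	pkt, err := buildPacket([]Metric{{Host: "LoadTest-0001", Key: "system.cpu.load", Value: "0.42"}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("header=%q total=%d bytes\n", pkt[:5], len(pkt))
}
```

The resulting bytes would be written to a TCP connection to the proxy or server. The tool's own sender may differ, for example by using the active-agent request type instead.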
- Go 1.21 or later
- Access to a Zabbix server API
- Network access to Zabbix proxies/server on port 10051
# Clone the repository
git clone <repository-url>
cd zbx-load-testing
# Build the binary
go build -o zbx-load-testing ./cmd/zbx-load-testing
# Or use the justfile
just build
Create a config.yaml file (see config.yaml for a complete example):
# Zabbix API Connection
zabbix_api:
  url: "http://localhost:8088/api_jsonrpc.php"
  user: "Admin"
  password: "your-password"
  timeout_sec: 120
# Zabbix Sender Configuration
zabbix_sender:
  proxies:
    - name: "proxy1"
      ip: "192.168.1.10"
      port: 10051
    - name: "proxy2"
      ip: "192.168.1.11"
      port: 10051
  port: 10051
  buffer_size: 1000
  buffer_send_sec: 5
  buffer_jitter_percent: 0.2 # 20% jitter to spread load
# Test Configuration
test_run:
  name: "LoadTest-01"
  hosts_to_create: 10000
  hosts_to_simulate: 10000
  speed_multiplier: 1 # 1x = normal speed, 2x = 2x faster
  randomize_start_time: true
  staggering_window_sec: 3600 # Spread LLD over 1 hour
  disable_lld: false
  templates:
    - "Linux by Zabbix agent active"
# Self-Throttling
throttling:
  enabled: true
  check_interval_sec: 60
  queue_threshold: 10
  history_syncer_busy_pct: 80
# Trigger Simulation
triggers:
  firing_percentage: 5 # 5% of hosts will have firing triggers
  state_change_interval_sec: 300 # Change state every 5 minutes
To distribute load across multiple servers:
test_run:
  hosts_to_create: 30000
  hosts_to_simulate: 30000
  instance_id: 0 # Server 1: instance 0
  total_instances: 3 # Total of 3 servers
Each server is assigned hosts round-robin: instance 0 gets hosts 0, 3, 6, 9, ...; instance 1 gets hosts 1, 4, 7, 10, ...; and so on.
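The round-robin split can be sketched as follows. This is a minimal illustration under the assumption that host indexes start at 0 and that an instance owns every index where index modulo total_instances equals its instance_id. The function name hostsForInstance is hypothetical.

```go
package main

import "fmt"

// hostsForInstance returns the host indexes one instance would simulate
// when hosts are dealt out round-robin across totalInstances instances.
func hostsForInstance(totalHosts, instanceID, totalInstances int) []int {
	if totalInstances <= 0 || instanceID < 0 || instanceID >= totalInstances {
		return nil // invalid configuration
	}
	var hosts []int
	for i := instanceID; i < totalHosts; i += totalInstances {
		hosts = append(hosts, i)
	}
	return hosts
}

func main() {
	// With 12 hosts and 3 instances, each instance gets 4 hosts.
	for id := 0; id < 3; id++ {
		fmt.Println(id, hostsForInstance(12, id, 3))
	}
}
```

With 30,000 hosts and three instances, each instance would end up with 10,000 hosts under this scheme.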
Create hosts in Zabbix:
./zbx-load-testing setup
This will:
- Create host groups
- Create hosts based on templates
- Assign hosts to proxies
- Configure items and discovery rules
Start the load test:
./zbx-load-testing run
The TUI will display:
- Load generator statistics (hosts, NVPS, LLD/sec)
- Zabbix server health (queue, busy %, NVPS)
- Sender response times (min/max/avg/percentiles)
- Real-time NVPS graph
- Connections per second graph
- Event logs
Keyboard controls:
- q: Quit the application
- +/-: Adjust NVPS graph time scale
- h: Time jump (adds 1 hour when randomize_start_time is enabled)
- d: Toggle debug mode (see below)
- p: Toggle profiling mode (see below)
When you observe issues with metric sending (e.g., using tcpdump), enable debug mode:
Press d to enable debug mode
This will:
- Create or append to the debug.log file
- Log all sender operations at DEBUG level
- Include details about:
  - Metrics being sent (count, host, proxy)
  - Buffer flush operations
  - Send durations and responses
  - Timer resets and scheduling
Example debug output:
time=2025-10-27T15:30:45Z level=DEBUG msg="Flushing metrics" count=150 host=LoadTest-0001 proxy=proxy1
time=2025-10-27T15:30:45Z level=DEBUG msg="Sending metrics" count=150 proxy=proxy1 proxy_addr=192.168.1.10:10051
time=2025-10-27T15:30:45Z level=DEBUG msg="Metrics sent successfully" duration=45ms responseInfo="processed: 150; failed: 0; total: 150; seconds spent: 0.045" proxy=proxy1 proxy_addr=192.168.1.10:10051
Press d again to disable debug mode
Debug logs help identify:
- Which hosts/proxies are affected
- If buffers are flushing
- Network send latency
- Protocol-level errors
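When scanning debug.log for failed values, the responseInfo field shown in the example output can be parsed mechanically. The sketch below assumes the field keeps the format from that example (processed, failed, total, seconds spent, separated by semicolons). SendResult and parseInfo are hypothetical names, not part of the tool.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SendResult holds the counters found in a sender response info string.
type SendResult struct {
	Processed, Failed, Total int
	Seconds                  float64
}

// parseInfo parses "processed: N; failed: N; total: N; seconds spent: F".
// Unknown keys are ignored; malformed fields return an error.
func parseInfo(info string) (SendResult, error) {
	var res SendResult
	for _, part := range strings.Split(info, ";") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		key, val, ok := strings.Cut(part, ":")
		if !ok {
			return res, fmt.Errorf("malformed field %q", part)
		}
		val = strings.TrimSpace(val)
		var err error
		switch strings.TrimSpace(key) {
		case "processed":
			res.Processed, err = strconv.Atoi(val)
		case "failed":
			res.Failed, err = strconv.Atoi(val)
		case "total":
			res.Total, err = strconv.Atoi(val)
		case "seconds spent":
			res.Seconds, err = strconv.ParseFloat(val, 64)
		}
		if err != nil {
			return res, fmt.Errorf("field %q: %w", key, err)
		}
	}
	return res, nil
}

func main() {
	res, err := parseInfo("processed: 150; failed: 0; total: 150; seconds spent: 0.045")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", res)
}
```

A non-zero Failed count usually points at values the receiving side rejected, which is worth correlating with the host and proxy fields on the same log line.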
When metrics stop sending and you need deeper analysis of goroutines and blocking:
Press p to enable profiling
This enables:
- Goroutine blocking profile collection
- Block profile rate tracking
Press p again to capture a profile snapshot
This creates three files: goroutine.prof, block.prof, and profile-snapshot.txt.
goroutine.prof: Binary profile showing all goroutines and their call stacks.
Analyze with:
# Interactive analysis
go tool pprof goroutine.prof
# Common commands inside pprof:
# - top : Show top goroutines
# - list : Show source code
# - web : Generate graph (requires graphviz)
# - traces : Show all stack traces
# Quick text output
go tool pprof -text goroutine.prof
# Generate SVG graph
go tool pprof -svg goroutine.prof > goroutines.svg
Look for:
- High goroutine counts (potential leaks)
- Goroutines stuck in chan send or chan receive
- Goroutines blocked on network I/O
block.prof: Shows where goroutines are blocking on synchronization primitives.
Analyze with:
# Interactive analysis
go tool pprof block.prof
# Show blocking events
go tool pprof -text block.prof
# Generate graph of blocking
go tool pprof -svg block.prof > blocking.svg
Look for:
- Mutex contention
- Channel blocking (full/empty channels)
- High blocking duration
profile-snapshot.txt contains:
- Total goroutine count
- Buffer states for all hosts
- Channel saturation levels
Example snapshot:
Profile Snapshot - 2025-10-27T15:30:45Z
===========================================
Total Goroutines: 10523
Buffered Senders Status (10000 total):
Host Proxy Buf Size Buf Cap Chan Len Chan Cap
--------------------------------------------------------------------------------------------
LoadTest-0001 proxy1 10 1000 25 1000
LoadTest-0002 proxy1 150 1000 500 1000
LoadTest-0003 proxy2 0 1000 1000 1000 <- FULL CHANNEL!
...
What to look for:
- Full channels (Chan Len == Chan Cap): Metrics not being consumed, likely network I/O blocking
- Large buffer sizes: Metrics accumulating but not flushing
- Unexpectedly high goroutine count: Potential goroutine leak
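For snapshots with thousands of rows, the full-channel check can be automated. The following sketch assumes the column layout of the example snapshot above (host, proxy, buffer size, buffer capacity, channel length, channel capacity, separated by whitespace). The function name fullChannels is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// fullChannels returns the hosts whose channel length has reached capacity.
// Lines that do not look like data rows (headers, separators) are skipped.
func fullChannels(snapshot string) []string {
	var full []string
	sc := bufio.NewScanner(strings.NewReader(snapshot))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 6 {
			continue
		}
		chanLen, err1 := strconv.Atoi(f[4])
		chanCap, err2 := strconv.Atoi(f[5])
		if err1 != nil || err2 != nil {
			continue // not a data row
		}
		if chanCap > 0 && chanLen >= chanCap {
			full = append(full, f[0])
		}
	}
	return full
}

func main() {
	sample := `Host Proxy Buf Size Buf Cap Chan Len Chan Cap
LoadTest-0001 proxy1 10 1000 25 1000
LoadTest-0002 proxy1 150 1000 500 1000
LoadTest-0003 proxy2 0 1000 1000 1000`
	fmt.Println(fullChannels(sample))
}
```

In practice the input would come from reading profile-snapshot.txt, for example with os.ReadFile.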
Workflow:
- Monitor with tcpdump to detect when metrics stop:
  tcpdump -i any -n port 10051
- When you see the issue, press d to enable debug mode
- Wait 10-30 seconds to collect logs
- Press p to enable profiling
- Wait a few seconds to collect blocking data
- Press p again to capture the profile snapshot
- Analyze the files:
  # Check debug logs for sending activity
  tail -f debug.log
  # Check buffer states
  cat profile-snapshot.txt
  # Analyze goroutines
  go tool pprof goroutine.prof
  # Check for blocking
  go tool pprof block.prof
Common findings:
- Network I/O blocking: Goroutines stuck in network send, visible in goroutine profile
- Channel saturation: Full metric channels in snapshot, indicates buffering issues
- Mutex contention: High blocking on mutexes in block profile
- Proxy connectivity: Debug logs show connection errors to specific proxies
Remove all created hosts:
./zbx-load-testing cleanup
For non-interactive debugging:
# Run with debug output to console
./zbx-load-testing run --debug
# Adjust verbosity (0=warn, 1=info, 2=debug, 3=trace)
./zbx-load-testing run --debug -v 3
- Calculated NVPS: Actual metrics sent by the tool
- Theoretical NVPS: Expected metrics based on item intervals
- Reported NVPS: What Zabbix server reports receiving
- DB Synced NVPS: What Zabbix has written to database
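To make the Theoretical NVPS figure concrete, here is a rough sketch of how it could be estimated from item intervals. It assumes every simulated host carries the same items, each item contributes one value per interval, and speed_multiplier scales the rate linearly. The function name and the intervals used are illustrative, not taken from the tool or any template.

```go
package main

import "fmt"

// theoreticalNVPS estimates new values per second for a fleet of identical
// hosts: hosts * sum(1/interval) * speedMultiplier.
func theoreticalNVPS(hosts int, itemIntervalsSec []float64, speedMultiplier float64) float64 {
	perHost := 0.0
	for _, interval := range itemIntervalsSec {
		if interval > 0 {
			perHost += 1 / interval
		}
	}
	return float64(hosts) * perHost * speedMultiplier
}

func main() {
	// Hypothetical template: two items every 60s and one every 300s.
	intervals := []float64{60, 60, 300}
	fmt.Printf("%.1f NVPS\n", theoreticalNVPS(10000, intervals, 1))
}
```

Under these assumptions, 10,000 hosts with that item mix would produce roughly 367 NVPS. A persistent gap between this estimate and Calculated NVPS suggests the generator itself cannot keep up.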
Shows response times for Zabbix sender protocol:
- Min/Max/Avg: Response time range
- Percentiles: P50, P75, P90, P95, P99, etc.
High percentiles indicate network or Zabbix server issues.
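For readers unfamiliar with percentile figures, the sketch below computes them from a set of response times using the nearest-rank method. Which method the tool actually uses is not stated here, so this is an assumption. The sample delays are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of the samples,
// in milliseconds, using the nearest-rank method on a sorted copy.
func percentile(samplesMs []float64, p float64) float64 {
	if len(samplesMs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samplesMs...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Made-up sender response times in milliseconds.
	delays := []float64{45, 12, 30, 250, 18, 22, 40, 35, 15, 900}
	for _, p := range []float64{50, 90, 99} {
		fmt.Printf("P%.0f = %.0f ms\n", p, percentile(delays, p))
	}
}
```

In this sample the median is 30 ms while P99 is 900 ms, which shows how a few slow responses dominate the high percentiles even when typical latency is low.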
- Queue (>1m): Items waiting more than 1 minute
- Queue (>10m): Items waiting more than 10 minutes
- Process Busy %: How busy each Zabbix process type is
  - High History Syncer busy % indicates a database bottleneck
  - High Trapper busy % indicates a receiving bottleneck
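These health indicators are what the throttling settings (queue_threshold, history_syncer_busy_pct) refer to. The sketch below shows one plausible way such thresholds could be turned into a throttle decision. It is purely an assumption for illustration: the type names, the choice of the >1m queue, and the "either limit exceeded" rule are hypothetical and may not match the tool's behavior.

```go
package main

import "fmt"

// Health is a hypothetical snapshot of the server indicators listed above.
type Health struct {
	QueueOver1m       int     // items waiting more than 1 minute
	HistorySyncerBusy float64 // busy percentage of history syncers
}

// Thresholds mirrors the throttling section of config.yaml.
type Thresholds struct {
	Queue             int     // throttling.queue_threshold
	HistorySyncerBusy float64 // throttling.history_syncer_busy_pct
}

// shouldThrottle reports whether load should be reduced. Assumption:
// exceeding either threshold is enough to trigger throttling.
func shouldThrottle(h Health, t Thresholds) bool {
	return h.QueueOver1m > t.Queue || h.HistorySyncerBusy > t.HistorySyncerBusy
}

func main() {
	limits := Thresholds{Queue: 10, HistorySyncerBusy: 80}
	fmt.Println(shouldThrottle(Health{QueueOver1m: 3, HistorySyncerBusy: 92.5}, limits))
	fmt.Println(shouldThrottle(Health{QueueOver1m: 3, HistorySyncerBusy: 40}, limits))
}
```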
Set buffer_jitter_percent to spread metric sending:
zabbix_sender:
  buffer_send_sec: 5
  buffer_jitter_percent: 0.2 # Each host gets a random 0-1s offset
- Disable LLD if not testing discovery:
  test_run:
    disable_lld: true
- Increase speed multiplier for faster item intervals:
  test_run:
    speed_multiplier: 2 # 2x faster than normal
- Tune buffer sizes:
  zabbix_sender:
    buffer_size: 5000 # Larger buffer
    buffer_send_sec: 10 # Flush less frequently
- Distribute across proxies:
  zabbix_sender:
    proxies:
      - name: "proxy1"
      - name: "proxy2"
      - name: "proxy3"
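The jitter setting described at the start of these tuning tips can be illustrated with a small sketch. It assumes the offset is drawn uniformly from zero up to buffer_send_sec multiplied by buffer_jitter_percent, which matches the 0-1s range quoted for a 5 second interval and 0.2 jitter. The function name jitterOffset is hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitterOffset scales a random fraction r in [0, 1) to a delay in
// [0, sendInterval*jitterFraction). Non-positive inputs yield no delay.
func jitterOffset(sendInterval time.Duration, jitterFraction, r float64) time.Duration {
	if jitterFraction <= 0 || r < 0 {
		return 0
	}
	return time.Duration(float64(sendInterval) * jitterFraction * r)
}

func main() {
	// buffer_send_sec: 5 and buffer_jitter_percent: 0.2 give offsets below 1s.
	for i := 0; i < 3; i++ {
		offset := jitterOffset(5*time.Second, 0.2, rand.Float64())
		fmt.Printf("host %d flushes after %v\n", i, 5*time.Second+offset)
	}
}
```

Giving each host its own offset keeps 10,000 buffers from flushing in the same instant, which is the thundering-herd effect the setting is meant to avoid.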
The generate_report option stores ALL sender delays and connection history in memory. For long-running tests with many hosts, this can consume significant memory.
Disable for long tests:
test_run:
  generate_report: false
- Check that zabbix_api.url is correct
- Verify credentials
- Ensure network connectivity
- Add at least one proxy to zabbix_sender.proxies
- Check templates exist in Zabbix
- Verify API user has permissions
- Check logs for specific errors
- Enable debug mode (d key)
- Check debug.log for send errors
- Enable profiling (p key) to check for blocking
- Use tcpdump to verify network traffic
- Check Zabbix proxy/server logs
- Disable generate_report
- Reduce hosts_to_simulate
- Check for goroutine leaks with profiling
zbx-load-testing/
├── cmd/zbx-load-testing/ # Main application
│ ├── main.go
│ ├── setup.go # Host creation
│ ├── run.go # Load test execution
│ └── cleanup.go # Host removal
├── internal/
│ ├── config/ # Configuration management
│ ├── tui/ # Terminal UI
│ └── zabbix/
│ ├── api/ # Zabbix API client
│ ├── sender.go # Zabbix sender protocol
│ └── buffered_sender.go # Buffered metric sending
└── config.yaml # Configuration file
When contributing, please:
- Follow Go best practices
- Add tests for new features
- Update documentation
- Use just build to build
- Test with various load levels
This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
This means you are free to:
- Use the software for any purpose
- Study and modify the source code
- Share the software with others
- Share your modifications
However, if you distribute modified versions, you must:
- Make the modified source code available under GPL-3.0
- Document the changes you made
- Include the same license
See the LICENSE file for the full license text.
Developed for stress-testing and validating Zabbix infrastructure at scale.