tn-bench v2.2

tn-bench is an open-source script that benchmarks your system and collects statistics via the TrueNAS API. During testing it creates a dataset in each of your pools, which consumes 20 GiB of space for each thread in your system.

🆕 What's New in v2.2

ARC Statistics Telemetry (arcstat)

Real-time ZFS ARC monitoring during READ benchmark phases
Measures cache hit rate, ARC size, MRU/MFU distribution, and prefetch effectiveness
Auto-detects L2ARC presence — L2ARC metrics omitted entirely on systems without cache devices
Per-thread-count analysis shows how ARC performance changes with workload scale

Enhanced Zpool Latency Analytics

Fixed critical column mapping bug: zpool iostat -l fields are interleaved read/write pairs, not grouped by type
Latency unit auto-scaling: Displays μs when mean < 1ms (NVMe-class storage), ms otherwise
Per-thread-count latency breakdown with P99 ratings and CV% consistency metrics
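
As a rough illustration of the two latency fixes above, here is a minimal sketch (not the tn-bench implementation). It assumes the scripted form `zpool iostat -l -H -p <pool>`, with tab-separated columns in the order documented by OpenZFS; newer releases may append extra columns, so check the order on your system. `COLUMNS`, `parse_iostat_line`, and `format_latency` are hypothetical names.

```python
# Assumed column order for `zpool iostat -l -H -p <pool>` (tab-separated).
# Latency columns come as read/write PAIRS per wait type, not grouped by type.
COLUMNS = [
    "pool", "alloc", "free",
    "ops_read", "ops_write",
    "bw_read", "bw_write",
    "total_wait_read", "total_wait_write",
    "disk_wait_read", "disk_wait_write",
    "syncq_wait_read", "syncq_wait_write",
    "asyncq_wait_read", "asyncq_wait_write",
    "scrub_wait", "trim_wait",
]


def parse_iostat_line(line: str) -> dict:
    """Map one scripted-mode iostat line onto named fields."""
    values = line.rstrip("\n").split("\t")
    row = {}
    for name, value in zip(COLUMNS, values):
        if name == "pool":
            row[name] = value
        else:
            # "-" (or an empty field) means no samples in that interval.
            row[name] = None if value in ("-", "") else float(value)
    return row


def format_latency(mean_ms: float) -> str:
    """Show microseconds when the mean is below 1 ms, milliseconds otherwise."""
    if mean_ms < 1.0:
        return f"{mean_ms * 1000:.1f} μs"
    return f"{mean_ms:.1f} ms"
```

Zipping against a single ordered list keeps the read/write interleaving explicit, which is the kind of mistake the column mapping fix addresses.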

L2ARC Auto-Detection

Detects cache devices via zpool status before starting telemetry collection
Prevents arcstat crashes on systems without L2ARC hardware
Dynamic field list: 18 fields (core + zfetch) without L2ARC, 21 fields with L2ARC

Previous: What's New in v2.1

Automatic Analytics

Post-benchmark analysis automatically identifies scaling patterns
Generates _analytics.json with structured performance data
Generates _report.md with human-readable markdown tables
Neutral data presentation — reports observations without judgment

Delta-Based Scaling Analysis

Tracks performance changes between thread count steps
Identifies optimal thread count for each pool
Shows thread efficiency (MB/s per thread at peak)
Highlights notable transitions (gains, losses, plateaus)

Per-Disk Pool Comparison

Compares individual disk performance to pool average
Shows variance percentage within each pool
Identifies outliers using % of pool max metric

Unified Telemetry Formatter

Single source of truth for console UI and markdown reports
Console output is now a "live preview" of the report content
Consistent formatting, CV% ratings, and table layouts
Future changes only need to happen in one place

Codebase Audit & Cleanup

Consolidated disk benchmark modules (removed disk_raw.py)
Removed ~250 lines of dead/stale code
Unified duplicate formatting logic
Reduced total module count from 16 to 15
Fixed edge-case bug in error handling

Previous: What's New in v2.0

Modular Architecture

tn-bench v2.0 has been completely refactored into a modular architecture. While the user experience remains identical to v1.x, the underlying codebase is now organized into clean, maintainable modules:

tn-bench/
├── truenas-bench.py          # Main coordinator (thin UI layer)
├── core/                     # Core functionality
│   ├── __init__.py          # System/pool/disk API calls
│   ├── dataset.py           # Dataset lifecycle management
│   ├── results.py           # JSON output handling
│   ├── analytics.py         # Scaling analysis engine (v2.1)
│   ├── report_generator.py  # Markdown report generation (v2.1)
│   ├── telemetry_formatter.py  # Unified console/markdown formatter (v2.1)
│   └── zpool_iostat_collector.py  # ZFS pool iostat telemetry (v2.1)
├── benchmarks/              # Benchmark implementations
│   ├── __init__.py          # Exports benchmark classes
│   ├── base.py              # Abstract base class
│   ├── zfs_pool.py          # ZFS pool write/read benchmark
│   └── disk_enhanced.py     # Individual disk benchmark (v2.0)
└── utils/                   # Common utilities
    └── __init__.py          # Colors, formatting, print functions

Benefits of this design:

Easier Maintenance: Each component is isolated and testable
Simple Extensibility: New benchmarks can be added by inheriting from BenchmarkBase
Clear Separation: UI, core logic, and benchmarks are cleanly separated
Reusable Components: Core utilities can be shared across benchmarks

See ARCHITECTURE.md for detailed documentation on the modular design.

Features

Modular Architecture: Clean separation between UI, core logic, and benchmarks
Enhanced Disk Benchmarking: Multiple test modes (serial, parallel, seek-stress) and configurable block sizes
Collects system information using TrueNAS API.
Benchmarks system performance using dd command.
Provides detailed information about system, pools, and disks.
Supports multiple pools with interactive selection.
Configurable iteration counts for both pool and disk benchmarks.
Space validation before running benchmarks.
Drive Writes Per Day (DWPD) calculation for pool benchmarks.
Colorized output for better readability.
JSON output with structured schema for sharing results.
Extensible: Easy to add new benchmark types via the BenchmarkBase class

Running the script only takes a git clone

Please note, this script needs to be run as `root`.

Interactive Mode (default):

git clone -b tn-bench-2.2 https://github.com/nickf1227/tn-bench.git && cd tn-bench && python3 truenas-bench.py

Unattended Mode (v2.3+): For automated runs, CI/CD, or batch testing, use --unattended with CLI arguments:

python3 truenas-bench.py --unattended --pools all --zfs-iterations 2 --disk-iterations 0 --confirm

See the Unattended Mode section for the full CLI reference.

NOTE: /dev/urandom generates inherently incompressible data, so the value of the compression options above is minimal in the current form.

The script will display system and pool information, then prompt you to continue with the benchmarks. Follow the prompts to complete the benchmarking process.

Benchmarking Process

Dataset Creation: The script creates a temporary dataset in each pool. The dataset is created with a 1M record size, no compression, and sync=disabled, using midclt call pool.dataset.create.
Space Validation: Before running benchmarks, the script checks available space in the dataset and warns if insufficient (requires 20 GiB × thread count). You can choose to proceed anyway or skip the pool.
Pool Write Benchmark: The script performs write benchmarks using dd across four thread-count configurations (1, cores÷4, cores÷2, and cores). Each configuration runs N times (configurable, default 2). We use /dev/urandom as our input file, so CPU performance may be relevant. This is by design as /dev/zero is flawed for this purpose, and CPU stress is expected in real-world use anyway. The data is written in 1M chunks to a dataset with a 1M record size. For each thread, 20G of data is written. This scales with the number of threads, so a system with 16 Threads would write 320G of data per iteration.
Pool Read Benchmark: The script performs read benchmarks using dd across the same four thread-count configurations. We are using /dev/null as our output file, so RAM speed may be relevant. The data is read in 1M chunks from a dataset with a 1M record size. For each thread, the previously written 20G of data is read.
DWPD Calculation: After each pool's benchmarks complete, the script calculates Drive Writes Per Day (DWPD) based on total data written, pool capacity, and test duration.
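
The thread-count configurations and space requirement described above can be sketched as follows. This is an illustration only: the integer division and the de-duplication on small core counts are assumptions, and `thread_plan` and `space_required_gib` are hypothetical names, not tn-bench functions.

```python
GIB_PER_THREAD = 20  # from the text above: 20 GiB written per thread


def thread_plan(logical_cores: int) -> list[int]:
    """Return the thread counts 1, cores/4, cores/2, cores.

    Assumption: integer division, with duplicates and zero dropped so that
    small systems (for example 2 cores) still get a sensible plan.
    """
    candidates = [1, logical_cores // 4, logical_cores // 2, logical_cores]
    plan = []
    for count in candidates:
        if count >= 1 and count not in plan:
            plan.append(count)
    return plan


def space_required_gib(threads: int) -> int:
    """Space needed in the test dataset for one iteration."""
    return threads * GIB_PER_THREAD
```

For a 40-thread system this gives the plan 1, 10, 20, 40 and a peak requirement of 800 GiB; a 16-thread system writes 320 GiB per iteration at its largest configuration.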

NOTE: ZFS ARC will also be used and will affect your results. If that is undesirable, zfs_arc_max can be set to 1 (which means 1 byte) to prevent ARC from caching. Setting it back to 0 restores the default behavior, but the system will need to be restarted!

I tested several permutations of file sizes on a dozen systems with varying storage types, capacity, and RAM, and eventually settled on the current behavior for several reasons. Primarily, I wanted to reduce the impact of the ZFS ARC without removing it, since in a real-world scenario you would be leveraging the benefits of ARC caching. However, to avoid wildly unrealistic results, I needed file sizes that saturate the ARC completely. I believe this gives us the best data possible.

Example of arcstat -f time,hit%,dh%,ph%,mh% 10 running while the benchmark is running.

Disk Benchmark: The script performs sequential read benchmarks on individual disks using dd. The read size is calculated as min(system RAM, disk size) to work around ARC caching. Data is read in 4K chunks to /dev/null, making this a 4K sequential read test. 4K was chosen because ashift=12 for all recent ZFS pools created in TrueNAS. The number of iterations is configurable (default 2). Run-to-run variance is expected, particularly on SSDs, as data may end up in internal caches.
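
To make the read-size rule concrete, here is a minimal sketch of how the dd arguments could be derived. The exact dd invocation tn-bench uses is not shown in this README, so the argument list below is an assumption, and `disk_read_command` is a hypothetical helper.

```python
BLOCK_SIZE = 4096  # 4K blocks, matching ashift=12


def disk_read_command(ram_bytes: int, disk_bytes: int, device: str) -> list[str]:
    """Build a dd command that reads min(RAM, disk size) in 4K blocks."""
    read_bytes = min(ram_bytes, disk_bytes)
    count = read_bytes // BLOCK_SIZE  # number of whole 4K blocks to read
    return [
        "dd",
        f"if={device}",
        "of=/dev/null",
        f"bs={BLOCK_SIZE}",
        f"count={count}",
    ]
```

The function only builds the argument list; nothing is executed until you pass it to something like `subprocess.run`, which keeps the sizing logic easy to check on its own.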

Enhanced Disk Benchmark (v2.0)

tn-bench v2.0 introduces an enhanced disk benchmark with multiple test modes and configurable block sizes:

Test Modes:

SERIAL (default): Test disks one at a time
- Best for baseline performance measurements
- Minimal system impact
- Recommended for production systems
PARALLEL: Test all disks simultaneously
- Stress tests storage controllers and backplanes
- Higher resource usage than serial mode
- Useful for identifying controller bottlenecks
SEEK_STRESS: Multiple threads per disk
- Heavy stress on disk seek mechanisms
- Can saturate CPU cores
- May cause system instability on busy systems
- Not recommended for production use

Block Size Options:

4K (small random I/O)
32K (medium I/O)
128K (large sequential)
1M (very large sequential)

Results: The script displays the results for each run and the average speed. This should give you an idea of the impacts of various thread-counts (as a synthetic representation of client-counts) and the ZFS ARC caching mechanism.

NOTE: The script's run duration depends on the number of threads and the number of disks in your system. Small all-flash systems may complete this benchmark in 25 minutes, while larger systems with spinning hard drives may take several hours. The script will not stop other I/O activity on a production system, but it will severely limit performance. The benchmark is best run on a system with no other workload: this gives the most accurate data and avoids creating angry users.

Performance Considerations

ARC Behavior

ARC hit rate decreases as working set exceeds cache size, which tn-bench intentionally causes.
Results reflect mixed cache hit/miss scenarios, which are not necessarily indicative of a real-world workload.

Resource Requirements

Resource Type	Requirement	Notes
Pool Test Space	20 GiB per thread	Space freed between iterations (v2.0+)
Thread Configurations	4 (1, cores÷4, cores÷2, cores)	For ZFS pool benchmarks
Default Iterations	2 per configuration	Configurable 1-100
Disk Serial Mode	Low impact	Default, safe for production
Disk Parallel Mode	Moderate controller load	All disks simultaneously
Disk Seek-Stress Mode	High CPU usage ⚠️	Multiple threads per disk, may saturate CPU

⚠️ Resource Allocation Warnings

SEEK_STRESS Mode:

Uses multiple concurrent threads per disk (4 threads default)
Can saturate all CPU cores
May cause system instability on heavily loaded systems
Not recommended for production systems
Only use on dedicated test systems with no other workloads

PARALLEL Mode:

Tests all disks simultaneously
Heavy load on storage controllers and backplanes
May impact other I/O operations
Use with caution on production systems

SERIAL Mode (Recommended):

Tests one disk at a time
Minimal system impact
Safe for production use
Best for baseline performance measurements

Execution Time

Small all-flash systems: ~10-30 minutes
Large HDD arrays: Several hours or more
Progress indicators: Provided at each stage
Status updates: For each benchmark operation

Cleanup Options

The script provides interactive prompts to delete test datasets after benchmarking. All temporary files are automatically removed.

Delete testing dataset fire/tn-bench? (yes/no): yes
✓ Dataset fire/tn-bench deleted.

UI Enhancement

The script is now colorized and more human readable.

Output Files

python3 truenas-bench.py [--output /root/my_results.json]

tn-bench generates three files for each benchmark run:

File	Suffix	Description
Results	`.json`	Raw benchmark data with system info, pool benchmarks, and disk benchmarks
Analytics	`_analytics.json`	Structured analysis of scaling patterns and per-disk performance
Report	`_report.md`	Human-readable markdown report with tables and observations

Example

python3 truenas-bench.py --output results.json

Generates:

results.json — Raw benchmark data
results_analytics.json — Scaling analysis and disk comparison
results_report.md — Markdown report for sharing
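
The naming convention above can be expressed in a few lines. This is a sketch of the convention as described, not the script's own code; `derived_outputs` is a hypothetical name.

```python
from pathlib import Path


def derived_outputs(output: str) -> dict[str, Path]:
    """Derive the analytics and report paths from the --output path."""
    results = Path(output)
    return {
        "results": results,
        "analytics": results.with_name(f"{results.stem}_analytics.json"),
        "report": results.with_name(f"{results.stem}_report.md"),
    }
```

For example, `derived_outputs("/root/my_results.json")` yields `/root/my_results_analytics.json` and `/root/my_results_report.md` alongside the raw results file.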

Analytics (v2.1+)

tn-bench automatically analyzes benchmark results to identify scaling patterns and performance characteristics:

What's Analyzed

Thread scaling: How performance changes as thread count increases
Optimization points: Thread count where peak performance occurs
Transition deltas: Speed changes between thread configurations
Per-disk variance: Individual drive performance relative to pool average

Key Metrics

Metric	Description
Peak Speed	Maximum throughput achieved
Optimal Threads	Thread count at peak performance
Thread Efficiency	MB/s per thread at peak
% of Pool Avg	Disk speed relative to pool mean
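
A minimal sketch of how these metrics could be derived from average speeds per thread count. The key names `peak_speed_mbps`, `optimal_threads`, and `thread_efficiency` follow the analytics output; the shape of each `deltas` entry is an assumption made for illustration, and `analyze_scaling` is a hypothetical function.

```python
def analyze_scaling(speeds_by_threads: dict[int, float]) -> dict:
    """Summarize thread scaling from {thread_count: average MB/s}."""
    steps = sorted(speeds_by_threads.items())
    optimal_threads, peak = max(steps, key=lambda step: step[1])

    # Change between each pair of adjacent thread-count steps.
    deltas = []
    for (prev_threads, prev_speed), (threads, speed) in zip(steps, steps[1:]):
        change = speed - prev_speed
        deltas.append({
            "from": prev_threads,
            "to": threads,
            "delta_mbps": round(change, 2),
            "delta_pct": round(change / prev_speed * 100, 1),
        })

    return {
        "peak_speed_mbps": peak,
        "optimal_threads": optimal_threads,
        "thread_efficiency": round(peak / optimal_threads, 1),
        "deltas": deltas,
    }
```

Feeding in write speeds of 2023.22, 2836.82, and 2813.03 MB/s at 10, 20, and 40 threads puts the peak at 20 threads, with a small negative delta from 20 to 40 threads.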

Sample Analytics Output

{
  "pool_analyses": [{
    "name": "tank",
    "write_scaling": {
      "peak_speed_mbps": 4465.7,
      "optimal_threads": 16,
      "thread_efficiency": 279.1,
      "progression": [...],
      "deltas": [...]
    },
    "read_scaling": { ... },
    "observations": [
      "Speed decreases from 16 to 32 threads"
    ]
  }],
  "disk_comparison": {
    "tank": {
      "pool_average_mbps": 614.5,
      "variance_pct": 0.3,
      "disks": [...]
    }
  }
}

The analytics engine uses neutral data presentation — it reports what it observes without making performance judgments. You draw the conclusions.

Live Telemetry Output (v2.2+)

During benchmark execution, tn-bench collects zpool iostat telemetry and displays detailed per-thread-count performance statistics in real time:

Example Telemetry Summary (M50 TrueNAS)

╔══════════════════════════════════════════════════════════╗
║        Zpool Iostat Telemetry Summary for Pool: ice      ║
╚══════════════════════════════════════════════════════════╝
  • Total samples: 1406  |  Steady-state samples: 1287
  • Duration: 1442.23 seconds

────────── Per-Thread-Count Steady-State Analysis ──────────
  WRITE telemetry only (READ excluded due to ZFS ARC cache interference)

  1 Threads (48 samples):
  ┌─ IOPS ────────────────────────────────────────────────────
  │ Mean: 958.4  │ Median: 0.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 4,940.5 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 1,466.3 [High Variance] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 153.0% High Variance │
  └──────────────────────────────────────────────────────────┘
  ┌─ Bandwidth (MB/s) ────────────────────────────────────────
  │ Mean: 307.9  │ Median: 0.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 1,194.2 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 487.5 [Good] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 158.3% High Variance │
  └──────────────────────────────────────────────────────────┘

  10 Threads (100 samples):
  ┌─ IOPS ────────────────────────────────────────────────────
  │ Mean: 6,643.8  │ Median: 6,470.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 11,607.0 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 1,974.5 [High Variance] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 29.7% Variable │
  └──────────────────────────────────────────────────────────┘

  40 Threads (376 samples):
  ┌─ IOPS ────────────────────────────────────────────────────
  │ Mean: 8,003.7  │ Median: 7,855.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 13,925.0 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 2,907.8 [High Variance] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 36.3% High Variance │
  └──────────────────────────────────────────────────────────┘

────────────────────────── Legend ──────────────────────────
  Statistical Measures:
    • Mean:    Average of all samples
    • Median:  Middle value (50th percentile), less affected by outliers
    • P99:     99th percentile - 99% of samples fall below this value
    • Std Dev: Standard deviation - measures spread/consistency
    • CV%:     Coefficient of Variation (std dev / mean × 100)

  CV% Rating (Consistency):
    • Excellent:    < 10%  (highly consistent)
    • Good:         10-20% (good consistency)
    • Variable:     20-30% (some variability)
    • High Variance:  > 30%  (significant inconsistency)

Understanding the Output

Per-Thread Analysis: Each thread count configuration shows:

IOPS: Operations per second with consistency ratings
Bandwidth (MB/s): Throughput with spread analysis
Latency (ms): Response time statistics (P99-rated by speed thresholds)

Why READ telemetry is excluded: ZFS ARC cache artificially inflates read performance numbers, making them misleading. tn-bench reports WRITE telemetry only for accurate pool performance visibility.
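
For reference, the CV% figure and its rating from the legend can be reproduced with a short sketch. Whether tn-bench uses the population or the sample standard deviation is not stated here, so `pstdev` is an assumption, as is the handling of values that land exactly on a band boundary. `cv_percent` and `cv_rating` are hypothetical names.

```python
import statistics


def cv_percent(samples: list[float]) -> float:
    """Coefficient of variation: std dev / mean x 100."""
    mean = statistics.fmean(samples)
    if mean == 0:
        return 0.0  # avoid dividing by zero on idle samples
    return statistics.pstdev(samples) / mean * 100


def cv_rating(cv: float) -> str:
    """Map a CV% value onto the consistency bands from the legend."""
    if cv < 10:
        return "Excellent"
    if cv < 20:
        return "Good"
    if cv <= 30:
        return "Variable"
    return "High Variance"
```

As a check against the sample output, a standard deviation of 1,466.3 on a mean of 958.4 gives about 153%, which falls in the High Variance band.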

ARC Statistics (v2.2+)

tn-bench v2.2 introduces comprehensive ARC (Adaptive Replacement Cache) telemetry using arcstat:

What's Collected

Metric	Description
ARC Hit %	Percentage of reads served from ARC
ARC Size (GiB)	Total ARC memory usage
Demand/Prefetch Hit %	Breakdown of hit types
MRU/MFU Distribution	Cache list balance
L2ARC Hit %	Secondary cache effectiveness (if present)
L2ARC Size (GiB)	L2ARC device capacity
ZFetch Stats	Prefetch engine performance

L2ARC Auto-Detection

Automatically detects L2ARC via zpool status
On systems without L2ARC: L2ARC metrics omitted entirely (no clutter)
On systems with L2ARC: Full L2ARC telemetry collected
Prevents arcstat crashes on non-L2ARC systems
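
One way to detect a cache device from zpool status output is sketched below. It assumes cache devices are listed under a line containing only the word `cache` in the config section, which is how zpool status normally prints them; tn-bench's own detection may differ. `has_l2arc` and `pool_has_l2arc` are hypothetical names.

```python
import subprocess


def has_l2arc(zpool_status_text: str) -> bool:
    """Return True if the status output contains a 'cache' vdev section."""
    return any(
        line.strip() == "cache" for line in zpool_status_text.splitlines()
    )


def pool_has_l2arc(pool: str) -> bool:
    """Run `zpool status <pool>` and check it for cache devices."""
    result = subprocess.run(
        ["zpool", "status", pool],
        capture_output=True,
        text=True,
        check=True,
    )
    return has_l2arc(result.stdout)
```

Keeping the text check separate from the subprocess call makes it possible to test the parsing on captured output without a live pool.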

Example ARC Summary

╔══════════════════════════════════════════════════════════╗
║   ARC Statistics Summary (READ Phase) for Pool: inferno  ║
╚══════════════════════════════════════════════════════════╝
  • Total samples: 487  |  Read-phase samples: 132
  • Duration: 486.23 seconds
  • L2ARC: not present (L2ARC metrics omitted)

──────────── Per-Thread-Count READ ARC Analysis ────────────
  ARC cache performance during READ benchmark phases

  1 Threads (4 samples):
  ┌─ ARC Hit % ───────────────────────────────────────────────
  │ Mean: 99.5% [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ Median: 99.9%  │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 0.8 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 0.8% Excellent │
  └──────────────────────────────────────────────────────────┘
  32 Threads (89 samples):
  ┌─ ARC Hit % ───────────────────────────────────────────────
  │ Mean: 57.9% [Poor] │

Rating thresholds:

Excellent: ≥ 95% (nearly all reads from cache)
Good: 85-95% (majority cached)
Variable: 70-85% (moderate caching)
Poor: < 70% (frequent cache misses)
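
The thresholds above translate directly into a small lookup. This is a sketch of the bands as listed, with the assumption that a value exactly on a boundary goes to the higher band; `arc_hit_rating` is a hypothetical name.

```python
def arc_hit_rating(hit_pct: float) -> str:
    """Rate an ARC hit percentage using the thresholds listed above."""
    if hit_pct >= 95:
        return "Excellent"
    if hit_pct >= 85:
        return "Good"
    if hit_pct >= 70:
        return "Variable"
    return "Poor"
```

With the example summary, a mean of 99.5% at 1 thread rates Excellent, while 57.9% at 32 threads rates Poor.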

Color Coding (console output):

Green: Excellent ratings
Cyan: Good ratings
Yellow: Variable/Acceptable
Red: High/High Variance

JSON Schema

{
  "schema_version": "1.0",
  "metadata": {
    "start_timestamp": "2025-03-15T14:30:00",
    "end_timestamp": "2025-03-15T15:15:00",
    "duration_minutes": 45.0,
    "benchmark_config": {
      "selected_pools": ["tank", "backups"],
      "disk_benchmark_run": true,
      "zfs_iterations": 2,
      "disk_iterations": 1
    }
  },
  "system": {
    "os_version": "25.04.1",
    "load_average_1m": 0.85,
    "load_average_5m": 1.2,
    "load_average_15m": 1.1,
    "cpu_model": "Intel Xeon Silver 4210",
    "logical_cores": 40,
    "physical_cores": 20,
    "system_product": "TRUENAS-M50",
    "memory_gib": 251.56
  },
  "pools": [
    {
      "name": "tank",
      "path": "/mnt/tank",
      "status": "ONLINE",
      "vdevs": [
        {"name": "raidz2-0", "type": "RAIDZ2", "disk_count": 8}
      ],
      "benchmark": [
        {
          "threads": 1,
          "write_speeds": [205.57, 209.95],
          "average_write_speed": 207.76,
          "read_speeds": [4775.63, 5029.35],
          "average_read_speed": 4902.49,
          "iterations": 2
        },
        {
          "threads": 10,
          "write_speeds": [1850.32, 1823.45],
          "average_write_speed": 1836.89,
          "read_speeds": [15234.56, 14987.23],
          "average_read_speed": 15110.90,
          "iterations": 2
        }
      ],
      "dwpd": 0.15,
      "total_writes_gib": 640.0
    }
  ],
  "disks": [
    {
      "name": "ada0",
      "model": "ST12000VN0008",
      "serial": "ABC123",
      "zfs_guid": "1234567890",
      "pool": "tank",
      "size_gib": 10999.99,
      "benchmark": {
        "speeds": [210.45],
        "average_speed": 210.45,
        "iterations": 1
      }
    }
  ]
}
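
If you want to post-process a results file, a minimal sketch of reading this schema might look like the following. It uses only the keys shown in the schema above; `load_results` and `summarize_pools` are hypothetical helpers, not part of tn-bench.

```python
import json


def load_results(path: str) -> dict:
    """Load a tn-bench results file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_pools(results: dict) -> list[tuple[str, int, float, float]]:
    """Return (pool, threads, avg write MB/s, avg read MB/s) rows."""
    rows = []
    for pool in results.get("pools", []):
        for run in pool.get("benchmark", []):
            rows.append((
                pool["name"],
                run["threads"],
                run["average_write_speed"],
                run["average_read_speed"],
            ))
    return rows
```

For example, `summarize_pools(load_results("results.json"))` returns one row per pool and thread count, which is easy to feed into a spreadsheet or a plotting script.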

Example Output (M50 TrueNAS with v2.2 telemetry)


############################################################
#                 tn-bench v2.2 (Modular)                  #
############################################################

TN-Bench is an OpenSource Software Script that uses standard tools to
Benchmark your System and collect various statistical information via
the TrueNAS API.

* TN-Bench will create a Dataset in each of your pools for testing purposes
* that will consume 20 GiB of space for every thread in your system.

! WARNING: This test will make your system EXTREMELY slow during its run.
! WARNING: It is recommended to run this test when no other workloads are running.

* ZFS ARC will impact your results. You can set zfs_arc_max to 1 to prevent ARC caching.
* Setting it back to 0 restores default behavior but requires a system restart.

============================================================
 Confirmation 
============================================================

Would you like to continue? (yes/no): yes

------------------------------------------------------------
|                    System Information                    |
------------------------------------------------------------

Field                 | Value                                     
----------------------+-------------------------------------------
Version               | 25.10.1                                   
Load Average (1m)     | 8.44091796875                             
Load Average (5m)     | 8.38720703125                             
Load Average (15m)    | 9.19482421875                             
Model                 | Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
Cores                 | 40                                        
Physical Cores        | 20                                        
System Product        | TRUENAS-M50-S                             
Physical Memory (GiB) | 251.55                                    

------------------------------------------------------------
|                     Pool Information                     |
------------------------------------------------------------

Field      | Value    
-----------+----------
Name       | fire     
Path       | /mnt/fire
Status     | ONLINE   
VDEV Count | 1        
Disk Count | 4        

VDEV Name  | Type           | Disk Count
-----------+----------------+---------------
raidz1-0    | RAIDZ1         | 4

------------------------------------------------------------
|                     Pool Information                     |
------------------------------------------------------------

Field      | Value   
-----------+---------
Name       | ice     
Path       | /mnt/ice
Status     | ONLINE  
VDEV Count | 5       
Disk Count | 35      

VDEV Name  | Type           | Disk Count
-----------+----------------+---------------
raidz2-0    | RAIDZ2         | 7
raidz2-1    | RAIDZ2         | 7
raidz2-2    | RAIDZ2         | 7
raidz2-3    | RAIDZ2         | 7
raidz2-4    | RAIDZ2         | 7

------------------------------------------------------------
|                     Disk Information                     |
------------------------------------------------------------

* The TrueNAS API returns N/A for the Pool for boot devices and disks not in a pool.
Field      | Value                     
-----------+---------------------------
Name       | sdan                      
Model      | KINGSTON_SA400S37120G     
Serial     | 50026B7784064E49          
ZFS GUID   | None                      
Pool       | N/A                       
Size (GiB) | 111.79                    
-----------+---------------------------
Name       | nvme0n1                   
Model      | INTEL SSDPE2KE016T8       
Serial     | PHLN013100MD1P6AGN        
ZFS GUID   | 17475493647287877073      
Pool       | fire                      
Size (GiB) | 1400.00                   
-----------+---------------------------
Name       | nvme2n1                   
Model      | INTEL SSDPE2KE016T8       
Serial     | PHLN931600FE1P6AGN        
ZFS GUID   | 11275382002255862348      
Pool       | fire                      
Size (GiB) | 1400.00                   
-----------+---------------------------
Name       | nvme3n1                   
Model      | SAMSUNG MZWLL1T6HEHP-00003
Serial     | S3HDNX0KB01220            
ZFS GUID   | 4368323531340162613       
Pool       | fire                      
Size (GiB) | 1399.22                   
-----------+---------------------------
Name       | nvme1n1                   
Model      | SAMSUNG MZWLL1T6HEHP-00003
Serial     | S3HDNX0KB01248            
ZFS GUID   | 3818548647571812337       
Pool       | fire                      
Size (GiB) | 1399.22                   
-----------+---------------------------
Name       | sdo                       
Model      | HUS728T8TAL4204           
Serial     | VAHD4XTL                  
ZFS GUID   | 6447577595542961760       
Pool       | ice                       
Size (GiB) | 7452.04                   
-----------+---------------------------
Name       | sds                       
Model      | HUS728T8TAL4204           
Serial     | VAHE4AJL                  
ZFS GUID   | 11464489017973229028      
Pool       | ice                       
Size (GiB) | 7452.04                   

... (35 total disks)

############################################################
#                      Pool Selection                      #
############################################################

* Available pools:
• 1. fire
• 2. ice
* Options:
• 1. Enter specific pool numbers (comma separated)
• 2. Type 'all' to test all pools
• 3. Type 'none' to skip pool testing

Enter your choice [all]: all

############################################################
#              ZFS Pool Benchmark Iterations               #
############################################################

* How many times should we run each test?
• • Enter any positive integer (1-100, default: 2)
• • Enter 0 to skip this benchmark

Enter iteration count [2]: 1

############################################################
#           Individual Disk Benchmark Iterations           #
############################################################

* How many times should we run each test?
• • Enter any positive integer (1-100, default: 2)
• • Enter 0 to skip this benchmark

Enter iteration count [2]: 0
* Skipping Individual Disk benchmark.

############################################################
#                  DD Benchmark Starting                   #
############################################################

* Using 40 threads for the benchmark.
* ZFS tests will run 1 time(s) per configuration
* Skipping individual disk benchmark

############################################################
#                    Testing Pool: fire                    #
############################################################

* Creating test dataset for pool: fire
✓ Dataset fire/tn-bench created successfully.

============================================================
 Space Verification 
============================================================

* Available space: 2793.50 GiB
* Space required:  800.00 GiB (20 GiB/thread × 40 threads)
* Test iterations: 1 (space freed between iterations)
✓ Sufficient space available - proceeding with benchmarks
* Starting zpool iostat collection for pool 'fire' (interval: 1s)
* Warming up zpool iostat collector (3 samples)...
✓ Zpool iostat collector warmup complete

============================================================
 Testing Pool: fire - Threads: 10 
============================================================

* --- Iteration 1 of 1 ---
* Zpool iostat collector: benchmark phase started
* Zpool iostat collector: segment → 10T-write
* Iteration 1: Writing...
* Iteration 1 write: 2023.22 MB/s
* Zpool iostat collector: segment → 10T-read
* Iteration 1: Reading...
* Iteration 1 read: 6517.87 MB/s
* Space freed after iteration 1

============================================================
 Testing Pool: fire - Threads: 20 
============================================================

* --- Iteration 1 of 1 ---
* Zpool iostat collector: segment → 20T-write
* Iteration 1: Writing...
* Iteration 1 write: 2836.82 MB/s
* Zpool iostat collector: segment → 20T-read
* Iteration 1: Reading...
* Iteration 1 read: 6590.46 MB/s
* Space freed after iteration 1

============================================================
 Testing Pool: fire - Threads: 40 
============================================================

* --- Iteration 1 of 1 ---
* Zpool iostat collector: segment → 40T-write
* Iteration 1: Writing...
* Iteration 1 write: 2813.03 MB/s
* Zpool iostat collector: segment → 40T-read
* Iteration 1: Reading...
* Iteration 1 read: 6628.14 MB/s
* Space freed after iteration 1
* Zpool iostat collector: benchmark phase ended
* Cooling down zpool iostat collector (3 samples)...
✓ Zpool iostat collector cooldown complete
✓ Zpool iostat collection complete: 857 samples

============================================================
 Zpool Iostat Telemetry Summary for Pool: fire 
============================================================


╔══════════════════════════════════════════════════════════╗
║       Zpool Iostat Telemetry Summary for Pool: fire      ║
╚══════════════════════════════════════════════════════════╝
  • Total samples: 857  |  Steady-state samples: 750
  • Duration: 859.54 seconds

────────── Per-Thread-Count Steady-State Analysis ──────────
  WRITE telemetry only (READ excluded due to ZFS ARC cache interference)

  10 Threads (97 samples):
  ┌─ IOPS ────────────────────────────────────────────────────
  │ Mean: 9,851.5  │ Median: 9,880.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 11,424.0 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 889.4 [Variable] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 9.0% Excellent │
  └──────────────────────────────────────────────────────────┘
  ┌─ Bandwidth (MB/s) ────────────────────────────────────────
  │ Mean: 2,667.2  │ Median: 2,680.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 3,030.8 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 248.0 [Good] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 9.3% Excellent │
  └──────────────────────────────────────────────────────────┘
  ┌─ Latency (ms) ────────────────────────────────────────────
  │ Mean: 0.0  │ Median: 0.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 0.0 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 0.0 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 491.5% High Variance │
  └──────────────────────────────────────────────────────────┘

  20 Threads (143 samples):
  ┌─ IOPS ────────────────────────────────────────────────────
  │ Mean: 12,699.0  │ Median: 12,800.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 16,158.0 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 1,454.1 [High Variance] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 11.5%  Good │
  └──────────────────────────────────────────────────────────┘
  ┌─ Bandwidth (MB/s) ────────────────────────────────────────
  │ Mean: 3,698.4  │ Median: 3,830.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 4,055.8 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 368.7 [Good] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 10.0% Excellent │
  └──────────────────────────────────────────────────────────┘
  ┌─ Latency (ms) ────────────────────────────────────────────
  │ Mean: 0.0  │ Median: 0.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 0.0 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 0.0 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 319.0% High Variance │
  └──────────────────────────────────────────────────────────┘

  40 Threads (288 samples):
  ┌─ IOPS ────────────────────────────────────────────────────
  │ Mean: 13,254.2  │ Median: 13,400.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 18,178.0 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 1,991.4 [High Variance] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 15.0%  Good │
  └──────────────────────────────────────────────────────────┘
  ┌─ Bandwidth (MB/s) ────────────────────────────────────────
  │ Mean: 3,680.1  │ Median: 3,860.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 4,050.0 [High] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 410.3 [Good] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 11.2%  Good │
  └──────────────────────────────────────────────────────────┘
  ┌─ Latency (ms) ────────────────────────────────────────────
  │ Mean: 0.0  │ Median: 0.0  │
  ├──────────────────────────────────────────────────────────┤
  │ P99: 0.3 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ Std Dev: 0.0 [Excellent] │
  ├──────────────────────────────────────────────────────────┤
  │ CV%: 449.0% High Variance │
  └──────────────────────────────────────────────────────────┘

────────────────────────── Legend ──────────────────────────
  Statistical Measures:
    • Mean:    Average of all samples
    • Median:  Middle value (50th percentile), less affected by outliers
    • P99:     99th percentile - 99% of samples fall below this value
    • Std Dev: Standard deviation - measures spread/consistency
    • CV%:     Coefficient of Variation (std dev / mean × 100)

  CV% Rating (Consistency):
    • Excellent:    < 10%  (highly consistent)
    • Good:         10-20% (good consistency)
    • Variable:     20-30% (some variability)
    • High Variance:  > 30%  (significant inconsistency)

  P99 Latency Rating (Lower is better):
    • Excellent:    < 10ms   (very fast)
    • Good:         < 50ms   (acceptable)
    • Acceptable:  < 100ms  (may impact workload)
    • High:          > 100ms  (significant latency)

  Std Dev Rating (Consistency - Lower is better):
    • Excellent:    Low spread    (very consistent)
    • Good:         Moderate      (acceptable spread)
    • Variable:     Noticeable    (some spread)
    • High Variance:  Wide spread   (inconsistent)

============================================================
 Pool Write Summary 
============================================================

* Total data written: 1420.00 GiB
* Pool capacity: 5584.00 GiB
* Benchmark duration: 867.60 seconds
* Drive Writes Per Day (DWPD): 25.32
* Cleaning up any remaining test files...

############################################################
#                    Benchmark Complete                    #
############################################################

✓ Total benchmark time: 16.01 minutes
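
The legend's statistical measures and the DWPD figure in the write summary can be reproduced with a few lines of Python. This is a minimal sketch, not tn-bench's own code: the function names are hypothetical, the sample standard deviation and nearest-rank P99 are assumptions about how the statistics are computed, and the handling of exact boundary values (10%, 20%, 30%) is a guess. The DWPD formula is inferred from the sample numbers above (1420 GiB written to a 5584 GiB pool in 867.60 seconds), which it reproduces.

```python
import math
import statistics


def cv_rating(cv_percent: float) -> str:
    """Map a CV% value to the consistency labels listed in the legend."""
    if cv_percent < 10:
        return "Excellent"
    if cv_percent < 20:
        return "Good"
    if cv_percent <= 30:
        return "Variable"
    return "High Variance"


def summarize(samples: list) -> dict:
    """Compute mean, median, P99, std dev, and CV% for one metric."""
    ordered = sorted(samples)
    mean = statistics.fmean(ordered)
    # Assumption: sample (n-1) standard deviation.
    std_dev = statistics.stdev(ordered) if len(ordered) > 1 else 0.0
    # Assumption: nearest-rank percentile.
    rank = max(1, math.ceil(0.99 * len(ordered)))
    cv_percent = (std_dev / mean * 100) if mean else 0.0
    return {
        "mean": mean,
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "std_dev": std_dev,
        "cv_percent": cv_percent,
        "cv_rating": cv_rating(cv_percent),
    }


def dwpd(total_written_gib: float, pool_capacity_gib: float,
         duration_seconds: float) -> float:
    """Fraction of the pool written during the run, scaled to a 24-hour day."""
    return (total_written_gib / pool_capacity_gib) * (86400 / duration_seconds)
```

With the values from the Pool Write Summary, `round(dwpd(1420.00, 5584.00, 867.60), 2)` gives 25.32, matching the reported figure.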

Unattended Mode (v2.3+)

tn-bench v2.3 adds full support for unattended/automated operation. All interactive prompts can be bypassed using CLI arguments, enabling batch testing, CI/CD integration, and scripted runs.

CLI Options

| Argument | Description | Values | Required in Unattended |
| --- | --- | --- | --- |
| `--unattended`, `--auto` | Enable unattended mode (skip all prompts) | `true` when present | No (but required to bypass prompts) |
| `--output` | Output JSON file path | Path string | No (default: `./tn_bench_results.json`) |
| `--pools` | Pool selection | `'all'`, `'none'`, or comma-separated names (e.g., `'fire,ice'`) | Yes |
| `--zfs-iterations` | ZFS pool benchmark iterations | Integer 0-100 (0 = skip) | Yes |
| `--pool-block-size` | Pool benchmark block size | `16K`, `32K`, `64K`, `128K`, `256K`, `512K`, `1M`, `2M`, `4M`, `8M`, `16M` | No (default: `1M`) |
| `--disk-iterations` | Disk benchmark iterations | Integer 0-100 (0 = skip) | Yes |
| `--disk-modes` | Disk test modes, comma-separated | `serial`, `parallel`, `seek_stress` | No (default: `serial`) |
| `--disk-block-size` | Disk benchmark block size | `4K`, `32K`, `128K`, `1M` | No (default: `1M`) |
| `--seek-threads` | Threads per disk for seek_stress mode | Integer 1-32 | No (default: `4`) |
| `--confirm` | Auto-confirm safety prompt | `true` when present | Yes |
| `--cleanup` | Auto-answer dataset cleanup | `yes` or `no` | No (default: `yes`) |

Preset Examples

All pools, no disks

python3 truenas-bench.py --unattended --pools all --zfs-iterations 2 --disk-iterations 0 --confirm

All disks, no pools

python3 truenas-bench.py --unattended --pools none --zfs-iterations 0 --disk-iterations 2 --disk-modes serial --confirm

Burn-in mode

python3 truenas-bench.py --unattended --pools all --zfs-iterations 5 --disk-iterations 3 --disk-modes serial,parallel --confirm

Specific pools with custom block sizes

python3 truenas-bench.py --unattended --pools fire,ice --zfs-iterations 2 --pool-block-size 128K --disk-iterations 1 --disk-block-size 1M --disk-modes serial --confirm

Seek-stress test with cleanup disabled

python3 truenas-bench.py --unattended --pools none --zfs-iterations 0 --disk-iterations 1 --disk-modes seek_stress --seek-threads 8 --cleanup no --confirm

Validation Rules

When --unattended is used:

--confirm is required — safety acknowledgment
--pools is required — specify which pools to test
--zfs-iterations is required — even if set to 0 (skip)
--disk-iterations is required — even if set to 0 (skip)
All other arguments have sensible defaults matching interactive mode
Missing required arguments trigger a helpful error with example usage
Dataset cleanup defaults to yes — temp datasets are deleted unless --cleanup no
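
The validation rules above could be expressed with `argparse` roughly as follows. This is an illustrative sketch only: `build_parser` and `unattended_errors` are hypothetical names, only a subset of the documented flags is declared, and the error wording is invented rather than taken from tn-bench.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a subset of the documented flags (sketch, not the real parser)."""
    parser = argparse.ArgumentParser(prog="truenas-bench.py")
    parser.add_argument("--unattended", "--auto", action="store_true")
    parser.add_argument("--confirm", action="store_true")
    parser.add_argument("--pools")
    parser.add_argument("--zfs-iterations", type=int)
    parser.add_argument("--disk-iterations", type=int)
    parser.add_argument("--cleanup", choices=["yes", "no"], default="yes")
    return parser


def unattended_errors(args: argparse.Namespace) -> list:
    """Return one message per violated rule; empty when the arguments are valid."""
    if not args.unattended:
        return []  # interactive mode prompts for everything instead
    errors = []
    if not args.confirm:
        errors.append("--confirm is required")
    if args.pools is None:
        errors.append("--pools is required")
    for flag, value in (("--zfs-iterations", args.zfs_iterations),
                        ("--disk-iterations", args.disk_iterations)):
        if value is None:
            errors.append(f"{flag} is required (use 0 to skip)")
        elif not 0 <= value <= 100:
            errors.append(f"{flag} must be between 0 and 100")
    return errors
```

Collecting every problem into a list, rather than exiting at the first one, lets a single error message show all missing arguments at once.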

Sample Output in Unattended Mode

############################################################
#                 tn-bench v2.3 (Unattended)               #
############################################################

* Mode: UNATTENDED (all prompts skipped)
* Unattended: Selected pools: fire, ice
* Unattended: ZFS pool iterations: 2
* Unattended: Pool block size: 128K
* Unattended: Disk iterations: 1
* Unattended: Disk test modes: serial
* Unattended: Disk block size: 1M
* Unattended: Dataset cleanup: yes

...

Compatibility Notes

Interactive mode remains default — no breaking changes
All existing functionality preserved — unattended is additive
Argument validation mirrors interactive bounds — same 0-100 ranges, same block size options
Error messages guide users — show missing arguments with examples
Default cleanup behavior — datasets auto-deleted in unattended (configurable)

Batch/Matrix Config Mode (v2.3+)

For automated sequential runs with different configurations — such as testing all block sizes on the same pool, comparing multiple pools with the same workload, or running a regression test suite — use --config with a JSON or YAML configuration file.

Quick Start

python3 truenas-bench.py --config batch_block_size_matrix.json --confirm

CLI Argument

| Argument | Description |
| --- | --- |
| `--config <path>` | Path to JSON or YAML batch config file |
| `--confirm` | Required safety confirmation |
| `--output` | Base path for output files (default: `./tn_bench_results.json`) |

--config cannot be combined with the individual --unattended arguments. When --config is used, all run parameters come from the config file.

Config File Schema

Config files have four top-level keys:

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| `description` | string | No | Human-readable description of this batch |
| `continue_on_error` | bool | No | If true, continue to next run on failure (default: false) |
| `global` | object | No | Default settings applied to all runs |
| `runs` | array | Yes | List of individual benchmark runs |

Global Settings

The global object sets defaults for all runs. Any setting in a run overrides the global value.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `pools` | list/string | `["all"]` | Pool selection: list of names, `["all"]`, or `["none"]` |
| `zfs_iterations` | int | `2` | ZFS benchmark iterations (0-100, 0 = skip) |
| `pool_block_size` | string | `"1M"` | Block size: 4K, 16K, 32K, 64K, 128K, 256K, 512K, 1M, 2M, 4M, 8M, 16M |
| `disk_iterations` | int | `0` | Disk benchmark iterations (0-100, 0 = skip) |
| `disk_modes` | list | `["serial"]` | Disk test modes: serial, parallel, seek_stress |
| `disk_block_size` | string | `"1M"` | Disk block size: 4K, 32K, 128K, 1M |
| `seek_threads` | int | `4` | Threads per disk for seek_stress (1-32) |
| `cleanup` | bool | `true` | Delete test datasets after each run |
| `verify_cleanup` | bool | `true` | Verify dataset deletion after cleanup |
| `retry_cleanup` | int | `3` | Max cleanup retry attempts |
| `force_cleanup` | bool | `false` | Use force delete on cleanup |

Run Settings

Each run in the runs array has:

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | Yes | Unique name for this run (used in filenames) |
| (any global key) | (same) | No | Override any global setting for this run |
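
The precedence described above (documented defaults, then the global object, then the individual run) amounts to a layered dictionary merge. The sketch below assumes that model; `DEFAULTS` copies the default column of the Global Settings table, and `effective_settings` is a hypothetical helper, not a tn-bench function.

```python
# Defaults as listed in the Global Settings table.
DEFAULTS = {
    "pools": ["all"],
    "zfs_iterations": 2,
    "pool_block_size": "1M",
    "disk_iterations": 0,
    "disk_modes": ["serial"],
    "disk_block_size": "1M",
    "seek_threads": 4,
    "cleanup": True,
    "verify_cleanup": True,
    "retry_cleanup": 3,
    "force_cleanup": False,
}


def effective_settings(global_cfg: dict, run: dict) -> dict:
    """Later layers win: built-in defaults < global object < per-run keys."""
    return {**DEFAULTS, **global_cfg, **run}


# Example using the second run of the regression test suite config.
global_cfg = {"cleanup": True, "verify_cleanup": True,
              "retry_cleanup": 5, "force_cleanup": True}
run = {"name": "baseline-disk-serial", "pools": ["none"],
       "zfs_iterations": 0, "disk_iterations": 2,
       "disk_modes": ["serial"], "disk_block_size": "1M"}
settings = effective_settings(global_cfg, run)
```

Here `settings["retry_cleanup"]` comes from the global object, `settings["pools"]` from the run, and `settings["seek_threads"]` falls back to the table default.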

Config File Examples

JSON: Block Size Matrix

{
  "description": "Block size matrix test on inferno pool",
  "continue_on_error": true,
  "global": {
    "pools": ["inferno"],
    "cleanup": true,
    "verify_cleanup": true,
    "retry_cleanup": 3,
    "disk_iterations": 0
  },
  "runs": [
    {
      "name": "4K-block-test",
      "zfs_iterations": 2,
      "pool_block_size": "4K"
    },
    {
      "name": "32K-block-test",
      "zfs_iterations": 2,
      "pool_block_size": "32K"
    },
    {
      "name": "128K-block-test",
      "zfs_iterations": 2,
      "pool_block_size": "128K"
    },
    {
      "name": "1M-block-test",
      "zfs_iterations": 2,
      "pool_block_size": "1M"
    }
  ]
}

YAML: Block Size Matrix

description: "Block size matrix test on inferno pool"
continue_on_error: true

global:
  pools:
    - inferno
  cleanup: true
  verify_cleanup: true
  retry_cleanup: 3
  disk_iterations: 0

runs:
  - name: 4K-block-test
    zfs_iterations: 2
    pool_block_size: "4K"

  - name: 32K-block-test
    zfs_iterations: 2
    pool_block_size: "32K"

  - name: 128K-block-test
    zfs_iterations: 2
    pool_block_size: "128K"

  - name: 1M-block-test
    zfs_iterations: 2
    pool_block_size: "1M"
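
Matrix configs like the block size examples above are repetitive enough to generate rather than write by hand. The following sketch builds the same structure as the JSON example; `block_size_matrix` is a hypothetical helper, and the run naming simply follows the `<size>-block-test` convention used in the example.

```python
import json


def block_size_matrix(pool: str, block_sizes: list, zfs_iterations: int = 2) -> dict:
    """Build a batch config with one run per pool block size."""
    return {
        "description": f"Block size matrix test on {pool} pool",
        "continue_on_error": True,
        "global": {
            "pools": [pool],
            "cleanup": True,
            "verify_cleanup": True,
            "retry_cleanup": 3,
            "disk_iterations": 0,
        },
        "runs": [
            {
                "name": f"{size}-block-test",
                "zfs_iterations": zfs_iterations,
                "pool_block_size": size,
            }
            for size in block_sizes
        ],
    }


config = block_size_matrix("inferno", ["4K", "32K", "128K", "1M"])
config_text = json.dumps(config, indent=2)  # write this to the --config file
```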

Iteration Scaling Test

{
  "description": "Measure variance reduction with more iterations",
  "continue_on_error": true,
  "global": {
    "pools": ["all"],
    "cleanup": true,
    "pool_block_size": "1M",
    "disk_iterations": 0
  },
  "runs": [
    { "name": "1-iteration", "zfs_iterations": 1 },
    { "name": "2-iterations", "zfs_iterations": 2 },
    { "name": "5-iterations", "zfs_iterations": 5 },
    { "name": "10-iterations", "zfs_iterations": 10 }
  ]
}

Multi-Pool Comparison

{
  "description": "Same config on different pools",
  "continue_on_error": true,
  "global": {
    "cleanup": true,
    "zfs_iterations": 3,
    "pool_block_size": "1M",
    "disk_iterations": 0
  },
  "runs": [
    { "name": "fire-pool", "pools": ["fire"] },
    { "name": "ice-pool", "pools": ["ice"] },
    { "name": "inferno-pool", "pools": ["inferno"] }
  ]
}

Regression Test Suite

{
  "description": "Known good configurations for periodic validation",
  "continue_on_error": false,
  "global": {
    "cleanup": true,
    "verify_cleanup": true,
    "retry_cleanup": 5,
    "force_cleanup": true
  },
  "runs": [
    {
      "name": "baseline-1M-zfs",
      "pools": ["all"],
      "zfs_iterations": 2,
      "pool_block_size": "1M",
      "disk_iterations": 0
    },
    {
      "name": "baseline-disk-serial",
      "pools": ["none"],
      "zfs_iterations": 0,
      "disk_iterations": 2,
      "disk_modes": ["serial"],
      "disk_block_size": "1M"
    }
  ]
}

Output Files

Batch mode generates individual results per run plus an aggregate summary:

| File | Description |
| --- | --- |
| `tn_bench_results_run1_4K-block-test.json` | Raw results for run 1 |
| `tn_bench_results_run1_4K-block-test_analytics.json` | Analytics for run 1 |
| `tn_bench_results_run1_4K-block-test_report.md` | Report for run 1 |
| `tn_bench_results_run2_32K-block-test.json` | Raw results for run 2 |
| ... | (same pattern for each run) |
| `tn_bench_results_batch_summary.json` | Aggregate summary with comparison |
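
For scripts that collect these files afterwards, the names can be predicted from the base output path and the run names. The sketch below infers the naming pattern from the table above; `batch_output_paths` is a hypothetical helper, and tn-bench's actual path handling may differ, for example in how it treats unusual characters in run names.

```python
from pathlib import Path


def batch_output_paths(base_output: str, run_names: list) -> dict:
    """Predict per-run and summary file paths from the --output base path."""
    base = Path(base_output)
    stem = base.stem  # e.g. "tn_bench_results"
    runs = []
    for index, name in enumerate(run_names, start=1):
        run_stem = f"{stem}_run{index}_{name}"
        runs.append({
            "results": base.with_name(f"{run_stem}.json"),
            "analytics": base.with_name(f"{run_stem}_analytics.json"),
            "report": base.with_name(f"{run_stem}_report.md"),
        })
    return {"runs": runs, "summary": base.with_name(f"{stem}_batch_summary.json")}


paths = batch_output_paths("./tn_bench_results.json",
                           ["4K-block-test", "32K-block-test"])
```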

Batch Summary Schema

{
  "description": "Block size matrix test on inferno pool",
  "config_file": "/root/batch_block_size_matrix.json",
  "start_time": "2025-02-08T14:30:00",
  "end_time": "2025-02-08T16:45:00",
  "total_duration_minutes": 135.0,
  "total_runs": 4,
  "successful_runs": 4,
  "failed_runs": 0,
  "system_info": {
    "cpu_model": "Intel Xeon Silver 4114",
    "logical_cores": 40,
    "memory_gib": 251.55
  },
  "runs": [
    {
      "index": 1,
      "name": "4K-block-test",
      "status": "success",
      "config": {
        "pools": ["inferno"],
        "zfs_iterations": 2,
        "pool_block_size": "4K"
      },
      "pool_metrics": {
        "inferno": {
          "peak_write_mbps": 245.3,
          "peak_write_threads": 40,
          "peak_read_mbps": 1234.5,
          "peak_read_threads": 20,
          "dwpd": 5.67,
          "total_writes_gib": 640.0,
          "duration_seconds": 1842.5
        }
      },
      "output_file": "/root/tn_bench_results_run1_4K-block-test.json",
      "duration_seconds": 1850.3
    }
  ]
}
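
Because the summary is plain JSON, it is easy to post-process. The sketch below extracts one comparison row per run for a given pool, using only field names that appear in the schema above; `comparison_rows` is a hypothetical helper, and skipping runs without metrics for the pool is an assumption about how failed runs are recorded.

```python
import json


def comparison_rows(summary: dict, pool: str) -> list:
    """Return (name, status, write MB/s, read MB/s, DWPD) per run for one pool."""
    rows = []
    for run in summary["runs"]:
        metrics = run.get("pool_metrics", {}).get(pool)
        if metrics is None:
            continue  # assumption: runs without metrics for this pool are skipped
        rows.append((
            run["name"],
            run["status"],
            metrics["peak_write_mbps"],
            metrics["peak_read_mbps"],
            metrics["dwpd"],
        ))
    return rows


# A trimmed-down summary with the same shape as the schema above.
summary = json.loads("""
{
  "total_runs": 1,
  "runs": [
    {
      "index": 1,
      "name": "4K-block-test",
      "status": "success",
      "pool_metrics": {
        "inferno": {"peak_write_mbps": 245.3, "peak_read_mbps": 1234.5, "dwpd": 5.67}
      }
    }
  ]
}
""")
rows = comparison_rows(summary, "inferno")
```

In a real script, `summary` would come from `json.load()` on `tn_bench_results_batch_summary.json`.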

Robust Dataset Cleanup

Batch mode includes hardened dataset cleanup between runs:

Pre-run safety check — if a stale tn-bench dataset exists before creating a new one, it's automatically cleaned up first
Post-run cleanup — dataset is deleted after each run completes
Verification — after deletion, the API is queried to confirm the dataset no longer exists
Retry logic — configurable retries (default 3) with automatic escalation to force delete
Force delete — force_cleanup: true uses forced deletion from the first attempt
Non-blocking — if cleanup ultimately fails after all retries, a warning is logged and the batch continues to the next run
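
The delete, verify, and retry cycle described above might look like the following. This is a sketch under stated assumptions, not tn-bench's implementation: `delete` and `exists` are hypothetical callables standing in for the TrueNAS API calls, and escalating to force delete from the second attempt onward is a guess, since the text does not say when escalation happens.

```python
import logging
from typing import Callable


def cleanup_dataset(delete: Callable[[bool], None],
                    exists: Callable[[], bool],
                    retries: int = 3,
                    force: bool = False) -> bool:
    """Delete a dataset, verify it is gone, and retry with escalation.

    Returns True once verification passes, False if all attempts fail.
    """
    for attempt in range(1, retries + 1):
        # Assumption: escalate to force delete after the first failed attempt.
        use_force = force or attempt > 1
        try:
            delete(use_force)
        except Exception as exc:  # a failed delete should not abort the batch
            logging.warning("cleanup attempt %d failed: %s", attempt, exc)
        if not exists():
            return True
    logging.warning("cleanup failed after %d attempts; continuing", retries)
    return False
```

Returning a boolean instead of raising keeps the behavior non-blocking: the caller can log the result and move on to the next run.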

Sample Batch Output

############################################################
#                 tn-bench v2.3 (Modular)                  #
############################################################

============================================================
 Batch Config Mode
============================================================

* Description: Block size matrix test on inferno pool
* Total runs: 4
* Continue on error: true
• Run 1: 4K-block-test — pools=['inferno'], block_size=4K, zfs_iter=2, disk_iter=0
• Run 2: 32K-block-test — pools=['inferno'], block_size=32K, zfs_iter=2, disk_iter=0
• Run 3: 128K-block-test — pools=['inferno'], block_size=128K, zfs_iter=2, disk_iter=0
• Run 4: 1M-block-test — pools=['inferno'], block_size=1M, zfs_iter=2, disk_iter=0

✓ Batch configuration validated — starting runs.

############################################################
#              Run 1 of 4: 4K-block-test                   #
############################################################

* Pools: inferno
* ZFS iterations: 2, Pool block size: 4k
...
✓ Run 1 (4K-block-test) completed successfully.

############################################################
#              Run 2 of 4: 32K-block-test                  #
############################################################

...

############################################################
#                Batch Results Comparison                   #
############################################################

============================================================
 Pool: inferno
============================================================

Run                            Status     Write MB/s   Read MB/s    DWPD     Duration
----------------------------------------------------------------------------------------------
4K-block-test                  success    245.3        1234.5       5.67     1843s
32K-block-test                 success    1023.7       3456.2       12.34    923s
128K-block-test                success    2345.1       5678.3       18.91    612s
1M-block-test                  success    3456.8       6789.4       25.32    487s

############################################################
#                     Batch Complete                        #
############################################################

✓ Total batch time: 64.42 minutes
* Successful: 4 / 4

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or fixes.

License

This project is licensed under the GPLv3 License - see the LICENSE file for details.
