This repository provides a Python script to benchmark different DICOM compression formats and their impact on size and conversion speed.
It uses gdcmconv to compress and decompress DICOMs, zipfile for system-level compression, and dcm2niix to measure NIfTI conversion speed for each format.
The script automatically:
- Creates or clears working folders (`raw`, `j2k`, `jpegls`, `jpeg`, `zip`, `out`, `dcm2niix_out`, and `zip_extract`).
- Compresses DICOMs from the `in/` directory using `gdcmconv`:
  - raw – uncompressed baseline
  - j2k – JPEG 2000
  - jpegls – JPEG-LS (lossless)
  - jpeg – baseline JPEG
- Measures:
  - Total compressed size
  - Total compression time
  - Total decompression time
- Benchmarks ZIP compression of the entire input folder.
- Runs dcm2niix on each format to test how long it takes to convert DICOMs → NIfTI (`-z n` for uncompressed NIfTI).
- Outputs a summary table (`results.md`) in Markdown format.
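The compression and conversion steps above can be sketched as follows. This is a minimal illustration, not the script's actual code: the contents of `fmt_flags` and the helper names (`gdcmconv_cmd`, `dcm2niix_cmd`, `timed_run`, `compress_folder`) are assumptions, and the `gdcmconv` options shown should be checked against your installed GDCM version.

```python
import subprocess
import time
from pathlib import Path

# Assumed mapping from format name to gdcmconv option; the real
# fmt_flags dictionary in the script may use different values.
fmt_flags = {
    "raw": ["--raw"],
    "j2k": ["--j2k"],
    "jpegls": ["--jpegls"],
    "jpeg": ["--jpeg"],
}


def gdcmconv_cmd(fmt, src, dst):
    """Build the gdcmconv command line for one file (hypothetical helper)."""
    return ["gdcmconv", *fmt_flags[fmt], str(src), str(dst)]


def dcm2niix_cmd(src_dir, out_dir):
    """Build a dcm2niix command that writes uncompressed NIfTI (-z n)."""
    return ["dcm2niix", "-z", "n", "-o", str(out_dir), str(src_dir)]


def timed_run(cmd):
    """Run a command, discard its output, and return elapsed seconds."""
    start = time.perf_counter()
    subprocess.run(
        cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start


def compress_folder(fmt, in_dir, out_dir):
    """Convert every .dcm file in in_dir and return the total elapsed time."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    total = 0.0
    for src in sorted(Path(in_dir).glob("*.dcm")):
        total += timed_run(gdcmconv_cmd(fmt, src, out_dir / src.name))
    return total
```

Timing each file with `time.perf_counter()` and summing gives a total that includes process start-up for every file, which is consistent with totals being reported across all files.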
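You need the following binaries on your system PATH:
- `gdcmconv` (part of GDCM)
- `dcm2niix` (optional; used only for the NIfTI conversion timing)
- Python ≥ 3.8

No external Python dependencies are required beyond the standard library (os, shutil, time, subprocess, zipfile, pathlib, sys).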
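Before running the benchmark, it can help to confirm that the tools are reachable. The sketch below uses `shutil.which` from the standard library; the `check_tools` helper and the split into required and optional tools are illustrative assumptions, not part of the script.

```python
import shutil
import sys

REQUIRED = ["gdcmconv"]   # assumed to be mandatory for the benchmark
OPTIONAL = ["dcm2niix"]   # assumed optional: only one column depends on it


def check_tools(required=REQUIRED, optional=OPTIONAL, which=shutil.which):
    """Return (missing_required, missing_optional) lists of tool names."""
    missing_required = [tool for tool in required if which(tool) is None]
    missing_optional = [tool for tool in optional if which(tool) is None]
    return missing_required, missing_optional


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Python 3.8 or newer is expected")
    missing_required, missing_optional = check_tools()
    for tool in missing_required:
        print(f"missing required tool: {tool}")
    for tool in missing_optional:
        print(f"missing optional tool: {tool}")
```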
1. Prepare input data

   Place your DICOM files, with the `.dcm` extension, inside a folder named `in` (next to the script). The results presented here are from DICOM files available for free on OSF.

       ./in/
       ├── 42.dcm
       ├── 43.dcm
       └── ...

2. Run the benchmark

       python3 benchmark_dicom_compression.py

   or (if you renamed it):

       python3 dcm_bench.py

3. Check the results

   After completion, the script generates `results.md`, containing a Markdown table summarizing compression ratios and timings.
Here are the results for 497 files, with a total size of 139.2 MB, available from OSF:
| format | size (ratio) | compress_time (s) | decompress_time (s) | dcm2niix_time (s) |
|---|---|---|---|---|
| raw | 1.000 (139.2 MB) | 5.633 | 5.601 | 0.299 |
| j2k | 0.603 (83.9 MB) | 8.521 | 6.976 | 2.906 |
| jpegls | 0.599 (83.4 MB) | 6.283 | 6.146 | 0.891 |
| jpeg | 0.638 (88.8 MB) | 6.040 | 5.948 | 0.598 |
| zip | 0.365 (50.8 MB) | 3.524 | 0.190 | 0.302 |
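Each value in the size (ratio) column is the total compressed size divided by the total raw size, followed by the compressed size itself. A minimal sketch of how such a row could be produced is shown below; the helper names are hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption about the script's units.

```python
from pathlib import Path

MB = 1_000_000  # assumption: decimal megabytes; the script may use 2**20


def folder_size(folder):
    """Total size in bytes of all regular files under folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def size_cell(compressed_bytes, raw_bytes):
    """Format the 'size (ratio)' cell, e.g. ratio followed by size in MB."""
    ratio = compressed_bytes / raw_bytes
    return f"{ratio:.3f} ({compressed_bytes / MB:.1f} MB)"


def table_row(fmt, compressed_bytes, raw_bytes, compress_s, decompress_s, dcm2niix_s):
    """Format one Markdown table row in the layout used by results.md."""
    return (
        f"| {fmt} | {size_cell(compressed_bytes, raw_bytes)} | "
        f"{compress_s:.3f} | {decompress_s:.3f} | {dcm2niix_s:.3f} |"
    )
```

For example, `table_row("j2k", 83_900_000, 139_200_000, 8.521, 6.976, 2.906)` reproduces the j2k row of the table under the decimal-megabyte assumption.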
- Compression ratio is reported as total compressed size / total raw size (e.g., 0.5 = half the original size).
- Times are total elapsed seconds across all files in the test folder.
- The script overwrites existing output folders on each run.
- If `dcm2niix` is not installed, the benchmark will still run but skip that column.
- You can easily extend the script to test additional codecs (e.g., JPEG 2000 HT, JPEG XR, RLE) by editing the `fmt_flags` dictionary.
- gdcmconv does not yet support the emerging high-throughput JPEG 2000 HT.
- The zip archive outperforms image compression for size as it also compresses the DICOM headers. It is also very fast as it does not need to parse the DICOM headers. It is a good choice for archiving data, but may not be appropriate if you wish to store your data on a PACS and search DICOM tags from a database.
- This script complements my earlier work on this topic.
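
The ZIP benchmark described in the notes above can be approximated with the standard `zipfile` module. The sketch below is an assumption about the approach, not the script's code: the function names are hypothetical, `ZIP_DEFLATED` is an assumed compression method, and `roundtrip_demo` uses a placeholder payload rather than a real DICOM file.

```python
import tempfile
import time
import zipfile
from pathlib import Path


def zip_folder(src_dir, zip_path):
    """Zip every file under src_dir; return elapsed seconds."""
    src_dir = Path(src_dir)
    start = time.perf_counter()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(src_dir.rglob("*")):
            if f.is_file():
                # Store paths relative to src_dir so the archive is relocatable.
                zf.write(f, arcname=f.relative_to(src_dir).as_posix())
    return time.perf_counter() - start


def unzip_folder(zip_path, dest_dir):
    """Extract the archive into dest_dir; return elapsed seconds."""
    start = time.perf_counter()
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
    return time.perf_counter() - start


def roundtrip_demo():
    """Zip and unzip a throwaway folder; return (raw_bytes, zip_bytes, identical)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        src = tmp / "in"
        src.mkdir()
        payload = b"DICM" * 10_000  # compressible placeholder, not a real DICOM
        (src / "42.dcm").write_bytes(payload)
        archive = tmp / "in.zip"
        zip_folder(src, archive)
        unzip_folder(archive, tmp / "zip_extract")
        restored = (tmp / "zip_extract" / "42.dcm").read_bytes()
        return len(payload), archive.stat().st_size, restored == payload
```

Because the archive is built from raw bytes without parsing any DICOM structure, the same functions work on any folder, which is also why the result cannot be queried by DICOM tag without extracting it first.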