A benchmarking framework for evaluating compression algorithms on scientific data arrays.
Latest benchmark results: https://magland.github.io/benchcompress/
Paper (WIP): https://magland.github.io/benchcompress/paper
Benchcompress is a comprehensive benchmarking framework for evaluating compression algorithms on scientific data arrays. The system follows an automated workflow:
- Defining Components
  - Algorithms are implemented in benchcompress/src/benchcompress/algorithms/
  - Datasets are defined in benchcompress/src/benchcompress/datasets/
  - Each component specifies metadata like version, tags, and compatibility requirements
- Automated Benchmarking
  - Benchmarks run automatically via GitHub Actions on pushes to the main branch
  - For each compatible algorithm-dataset pair, the benchmark measures:
    - Compression ratio
    - Encoding throughput (MB/s)
    - Decoding throughput (MB/s)
  - Results are verified by decompressing and comparing with the original data
- Result Storage
  - Results are committed to a dedicated benchmark-results branch
  - Local and remote caching system prevents redundant rerunning of benchmarks (only modified or added components are re-benchmarked)
  - Caching is based on algorithm and dataset versions
- Web Interface
  - Interactive visualization at https://magland.github.io/benchcompress/
  - Filter and sort results by dataset or algorithm
  - Visual charts for comparing performance metrics
  - Export results to CSV for further analysis
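
The three metrics listed under Automated Benchmarking, together with the round-trip verification step, can be illustrated with a minimal sketch. This is not the framework's actual code: `benchmark_pair` is a hypothetical helper, `zlib` stands in for a registered algorithm, the dataset is synthetic, and computing throughput against the uncompressed size with 1 MB = 10^6 bytes is an assumption.

```python
import time
import zlib


def benchmark_pair(data: bytes, encode, decode) -> dict:
    """Benchmark one algorithm on one dataset (hypothetical helper)."""
    t0 = time.perf_counter()
    compressed = encode(data)
    encode_seconds = time.perf_counter() - t0

    t0 = time.perf_counter()
    restored = decode(compressed)
    decode_seconds = time.perf_counter() - t0

    # Verification: a lossless round trip must reproduce the input exactly.
    if restored != data:
        raise ValueError("round-trip verification failed")

    # Assumption: throughput is relative to the uncompressed size, 1 MB = 1e6 bytes.
    megabytes = len(data) / 1e6
    return {
        "compression_ratio": len(data) / len(compressed),
        "encode_mb_per_s": megabytes / max(encode_seconds, 1e-9),
        "decode_mb_per_s": megabytes / max(decode_seconds, 1e-9),
    }


# Synthetic stand-in dataset: a slowly varying 16-bit signal serialized to bytes.
samples = [(i // 8) % 1000 for i in range(200_000)]
data = b"".join(s.to_bytes(2, "little") for s in samples)

result = benchmark_pair(data, zlib.compress, zlib.decompress)
print(result)
```

A single timed call is noisy; a real run would likely repeat the measurement and aggregate the timings.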
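
The version-based caching described under Result Storage could work roughly as sketched below. The names `cache_key` and `run_with_cache`, the dictionary-based component metadata, and the in-memory cache are all assumptions made for illustration; the real framework's local and remote cache may be organized differently.

```python
import hashlib
import json


def cache_key(algorithm: dict, dataset: dict) -> str:
    """Build a key from component names and versions only (hypothetical scheme)."""
    payload = json.dumps(
        {
            "algorithm": [algorithm["name"], algorithm["version"]],
            "dataset": [dataset["name"], dataset["version"]],
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_with_cache(algorithm: dict, dataset: dict, cache: dict, run_benchmark):
    """Run the benchmark only if no result exists for this version pair."""
    key = cache_key(algorithm, dataset)
    if key not in cache:
        cache[key] = run_benchmark(algorithm, dataset)
    return cache[key]


# Usage with placeholder components and a fake benchmark that records its calls.
calls = []


def fake_benchmark(algorithm, dataset):
    calls.append((algorithm["version"], dataset["version"]))
    return {"compression_ratio": 2.0}


cache = {}
algo = {"name": "example-algorithm", "version": "1.0.0"}
dset = {"name": "example-dataset", "version": "1.0.0"}

run_with_cache(algo, dset, cache, fake_benchmark)  # runs the benchmark
run_with_cache(algo, dset, cache, fake_benchmark)  # served from the cache

algo_v2 = {"name": "example-algorithm", "version": "1.1.0"}
run_with_cache(algo_v2, dset, cache, fake_benchmark)  # version bump triggers a rerun
print(len(calls))
```

Keying on versions rather than on source contents means a component is only re-benchmarked when its declared version changes.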
The project consists of two main components:
- benchcompress/: Python package containing the core benchmarking framework, algorithms, and datasets
- web-ui/: React-based web interface for visualizing benchmark results
- Install Python dependencies:

      # You may need to first install wavpack
      # apt-get install libwavpack-dev
      cd benchcompress
      pip install -e .
      benchcompress --help
      benchcompress list
      benchcompress run --help

- Install web UI dependencies:

      cd web-ui
      npm install

- Run web UI locally:

      cd web-ui
      npm run dev

This project uses pre-commit hooks to automatically check code formatting before each commit. The formatting includes:
- Python code formatting using black
- TypeScript/JavaScript code formatting using npm scripts
- C++ code formatting using clang-format
To set up the pre-commit hooks after cloning the repository:
- Install pre-commit:

      pip install pre-commit

- Install the git hook scripts:

      pre-commit install

After this setup, code will be automatically checked for formatting when you make a commit.

Run ./devel/format_code.sh to format all code in the repository.