Impalab is a language-agnostic framework for orchestrating micro-benchmarks. It allows you to define, build, and run benchmark components written in any language, piping data from a generator to one or more algorithm implementations.
This design makes it simple to perform both:
- Inter-language benchmarking: Compare the performance of the same algorithm (e.g., linear_search) in Zig vs. Python.
- Intra-language benchmarking: Compare the performance of different algorithms (e.g., linear_search vs. binary_search) within the same language.
The core of Impalab is the impa CLI, a Rust-based orchestrator that manages two types of components:
- Generators: Programs that generate test data (e.g., random numbers, strings) and print it to stdout.
- Algorithms: Programs that read data from stdin, run one or more named functions against it, and print performance results to stdout.
Impalab works by decoupling data generation from algorithm execution. You define each component in its own directory with a simple impafile.toml that tells Impalab how to build and run it.
- Define: You create an impafile.toml for each generator or algorithm component.
- Build: You run impa build, which finds all impafile.toml files, executes their optional [build] steps, and registers the component's [run] command in an impa_manifest.json.
- Run: You run impa run, specifying which generator to use and which algorithms to test. impa handles spawning processes, piping stdout from the generator to the stdin of each algorithm, and collecting the results.
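The Run step can be pictured with a small Python sketch. This is not impa's implementation (impa is written in Rust, and the source does not say whether it streams or buffers generator output); it only illustrates the idea of feeding one generator's stdout to the stdin of several algorithm processes. The function name `run_pipeline` and the stand-in commands are hypothetical.

```python
import subprocess
import sys


def run_pipeline(generator_cmd, algorithm_cmds):
    """Run the generator once, then feed its stdout to each algorithm's stdin.

    Assumption: output is buffered in memory for simplicity; a real
    orchestrator might stream it instead.
    """
    generated = subprocess.run(
        generator_cmd, capture_output=True, text=True, check=True
    ).stdout

    results = {}
    for language, cmd in algorithm_cmds.items():
        proc = subprocess.run(
            cmd, input=generated, capture_output=True, text=True, check=True
        )
        # Each stdout line is expected to be "id,function_name,duration_nanos".
        results[language] = proc.stdout.splitlines()
    return results


if __name__ == "__main__":
    # Stand-in components so the sketch runs without any real benchmark code.
    gen = [sys.executable, "-c", "print('run_1 8 10 5 3 8 1')"]
    algo = [
        sys.executable,
        "-c",
        "import sys\nfor line in sys.stdin:\n    print(line.split()[0] + ',noop,0')",
    ]
    print(run_pipeline(gen, {"python": algo}))
```

Because every algorithm receives the same bytes, differences in the reported durations come from the implementations rather than from the input data.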
The impafile.toml defines the component's name, type, and, most importantly, how to build and run it. The [run] block is the key part: impa executes its command (which must be on the PATH or given as a relative path) and passes it the listed args.
Example 1: Algorithm (Compiled - Zig)
This component has a [build] step to create a binary, and the [run] command points to the resulting executable.
my_zig_algos/impafile.toml:
name = "zig-algos"
type = "algorithm"
language = "zig"
[build]
command = "zig"
args = ["build-exe", "main.zig", "--name", "run_zig", "-O", "ReleaseSmall"]
# 'impa' will execute './run_zig' from this directory
[run]
command = "./run_zig"
Example 2: Algorithm (Interpreted - Python)
This component has no [build] step. The [run] command calls the python3 interpreter (which impa assumes is in the PATH) and passes the script name as an argument.
my_python_algos/impafile.toml:
name = "python-algos"
type = "algorithm"
language = "python"
# No [build] step needed!
[run]
command = "python3"
args = ["main.py"]
Example 3: Generator (TypeScript - Deno)
This component uses deno run (assuming it's in the PATH) to execute the generator script.
search_gen_deno/impafile.toml:
name = "search-ints-deno"
type = "generator"
[run]
command = "deno"
args = ["run", "--allow-read", "main.ts"]
To work with Impalab, your component executables must follow a simple interface.
A generator:
- Must accept a --seed=<u64> argument, which will be provided by impa. This ensures that the exact same data is generated for each run, allowing for fair comparisons when testing across different languages.
- May accept any number of custom arguments, which are passed through by the impa run command. These are used to control the characteristics of the test data (e.g., --size=10000).
- Must print its generated data to stdout. Each line represents a single test case, starting with a unique id.
- stderr will be captured and forwarded by impa for logging.
Example Output (from the TypeScript generator):
run_1 8 10 5 3 8 1
run_2 4 9 2 7 4 6
In this convention, each line is a test case.
- run_1 is the unique ID.
- 8 is the "needle" to search for.
- 10 5 3 8 1 is the "haystack" to search in.
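As a rough illustration of the generator rules, here is a minimal Python sketch that emits lines in the "id needle haystack..." convention shown above. The `--count` flag, the value range, and the function names are assumptions made for this example; only `--seed` is required by the interface, and `--size` is the custom argument used elsewhere in this README.

```python
import argparse
import random


def generate_cases(seed, size, count):
    """Yield `count` test-case lines; the same seed always yields the same lines."""
    rng = random.Random(seed)  # private RNG so the output depends only on the seed
    for i in range(1, count + 1):
        haystack = [rng.randrange(1_000_000) for _ in range(size)]
        needle = rng.choice(haystack)  # assumes size >= 1
        yield f"run_{i} {needle} " + " ".join(map(str, haystack))


def main(argv=None):
    parser = argparse.ArgumentParser()
    # impa always supplies --seed; the default only lets the script run standalone.
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--size", type=int, default=5)   # custom argument
    parser.add_argument("--count", type=int, default=2)  # hypothetical custom argument
    args = parser.parse_args(argv)
    for line in generate_cases(args.seed, args.size, args.count):
        print(line)  # test data goes to stdout, one case per line


if __name__ == "__main__":
    main()
```

Seeding a dedicated `random.Random` instance, rather than the module-level generator, keeps the output reproducible even if other code in the process also draws random numbers.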
An algorithm:
- Must accept a --functions=<list> argument (e.g., --functions=linear_search,binary_search).
- Must read test cases line-by-line from stdin.
- Must understand the data format from the generator (e.g., parse the ID, "needle", and "haystack" from each line).
- Must print results to stdout in a simple CSV format: id,function_name,duration_nanos. The id must match the one received from the generator.
- stderr will be captured and forwarded by impa for logging.
Example Output (from the Zig algorithm): (This output corresponds to the generator input above)
run_1,linear_search,450
run_1,binary_search,30
run_2,linear_search,455
run_2,binary_search,31
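The algorithm side of the interface can be sketched in Python along these lines. This is an illustrative outline and not the project's own main.py: `builtin_search_py`, the `FUNCTIONS` table, and the choice of `time.perf_counter_ns` for timing are assumptions, while `linear_search_py` is the name used for the Python component elsewhere in this README.

```python
import argparse
import sys
import time


def linear_search_py(needle, haystack):
    """Return the index of the first match, or -1 if the needle is absent."""
    for index, value in enumerate(haystack):
        if value == needle:
            return index
    return -1


def builtin_search_py(needle, haystack):
    """Hypothetical second function: membership test via the `in` operator."""
    return needle in haystack


FUNCTIONS = {
    "linear_search_py": linear_search_py,
    "builtin_search_py": builtin_search_py,
}


def run(lines, function_names, out=sys.stdout):
    """Time each requested function on each case and print CSV rows."""
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        case_id, needle = parts[0], int(parts[1])
        haystack = [int(x) for x in parts[2:]]
        for name in function_names:
            func = FUNCTIONS[name]
            start = time.perf_counter_ns()
            func(needle, haystack)
            elapsed = time.perf_counter_ns() - start
            # id,function_name,duration_nanos
            print(f"{case_id},{name},{elapsed}", file=out)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--functions", default="")
    args = parser.parse_args(argv)
    names = [n for n in args.functions.split(",") if n]
    if not names:
        return  # nothing requested, so nothing to time
    run(sys.stdin, names)


if __name__ == "__main__":
    main()
```

Parsing happens outside the timed region, so the reported duration covers only the function call itself.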
A typical benchmark project might look like this, with component directories placed in the project root:
my_benchmarks/
├── zig_algos/
│ ├── impafile.toml
│ └── main.zig # Implements linear_search, binary_search
├── python_algos/
│ ├── impafile.toml
│ └── main.py # Implements linear_search_py
├── search_ints_gen_deno/
│ ├── impafile.toml
│ ├── main.ts # Generates lines of "id needle haystack..."
│ └── deno.json
│
└── impa_manifest.json (This file will be generated)
First, run the build command from the project's root directory.
impa build
This command will find all impafile.toml files, execute their [build] steps, and create impa_manifest.json, mapping component names and languages to their [run] commands.
Note: This manifest is a build artifact and should typically be added to your .gitignore file.
Now, run the benchmarks. This command will use the search-ints-deno generator, pipe its output to both the zig and python algorithm executables, and pass the --size 10000 argument to the generator.
impa run \
--generator "search-ints-deno" \
--algorithms '{"zig": ["linear_search", "binary_search"], "python": ["linear_search_py"]}' \
-- \
--size 10000
- --generator "search-ints-deno": Use the generator named in its impafile.toml.
- --algorithms '...': A JSON map of language to a list of function names to run.
- --: All arguments after this are passed directly to the generator (search-ints-deno).
Notice how this single command performs both intra-language benchmarking (comparing linear_search vs. binary_search for the "zig" language) and inter-language benchmarking (comparing the Zig linear_search against Python's linear_search_py).
If an algorithm doesn't require generated data (e.g., calculating Fibonacci), you can pass --generator "none".
impa run \
--generator "none" \
--algorithms '{"zig": ["fib_recursive", "fib_iterative"]}'
In this mode, the algorithm's stdin is connected to /dev/null.
impa captures the id,function_name,duration_nanos CSV output from all algorithm components and prints it to its own stdout as structured, newline-delimited JSON (JSONL).
{"id":"run_1","language":"zig","function_name":"linear_search","duration":450}
{"id":"run_1","language":"zig","function_name":"binary_search","duration":30}
{"id":"run_1","language":"python","function_name":"linear_search_py","duration":52000}
{"id":"run_2","language":"zig","function_name":"linear_search","duration":455}
{"id":"run_2","language":"zig","function_name":"binary_search","duration":31}
{"id":"run_2","language":"python","function_name":"linear_search_py","duration":52150}
This JSONL format is designed for easy consumption. While you can pipe it to tools like jq for quick queries, the intended use case is to parse it in a data analysis environment. For example, you can easily load the output into a Jupyter notebook, parse each line, and build a pandas.DataFrame for sophisticated analysis and visualization.
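A minimal sketch of that parsing step, using only the Python standard library so it runs without extra dependencies. The helper names `load_results` and `mean_duration` are made up for this example; the field names come from the sample output above.

```python
import json
from collections import defaultdict
from statistics import mean


def load_results(lines):
    """Parse JSONL text lines into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def mean_duration(records):
    """Average the duration per (language, function_name) pair."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["language"], rec["function_name"])].append(rec["duration"])
    return {key: mean(values) for key, values in grouped.items()}


if __name__ == "__main__":
    sample = [
        '{"id":"run_1","language":"zig","function_name":"linear_search","duration":450}',
        '{"id":"run_2","language":"zig","function_name":"linear_search","duration":455}',
        '{"id":"run_1","language":"python","function_name":"linear_search_py","duration":52000}',
        '{"id":"run_2","language":"python","function_name":"linear_search_py","duration":52150}',
    ]
    for key, value in mean_duration(load_results(sample)).items():
        print(key, value)
```

The list returned by `load_results` is a plain list of dicts, so it can also be handed to `pandas.DataFrame(records)` when pandas is available.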
impa build: Scans for impafile.toml files, runs their build commands, and creates a JSON manifest.
- --components-dir <PATH>: The root directory containing component subdirectories. (Default: .)
- --manifest-path <PATH>: The output path for the build manifest. (Default: impa_manifest.json)
impa run: Runs the benchmark using the specified components and manifest.
Key Arguments:
- --algorithms <JSON_STRING>: (Required) A JSON string mapping languages to a list of function names to run.
  - Example: '{"zig": ["linear_search", "binary_search"], "python": ["linear_search_py"]}'
- --generator <NAME>: (Required) The name of the generator component to use (must match a name in the manifest), or none for self-contained algorithms.
- --manifest-path <PATH>: Path to the build manifest. (Default: impa_manifest.json)
- --seed <u64>: (Optional) A specific seed for the random number generator.
- [generator_args]...: Any arguments after -- are passed directly to the generator executable.
Override Arguments: You can bypass the manifest file by providing direct paths to executables:
- --generator-override-path <PATH>: Use this executable for the generator.
- --algorithm-override-paths <JSON_MAP>: A JSON string mapping a language to a specific executable path.
  - Example: '{"zig": "./my_zig_exe", "python": "./main.py"}'
Logging is configured via environment variables:
- RUST_LOG: Sets the log level (e.g., RUST_LOG=info, RUST_LOG=debug). Defaults to info.
- BENCH_LOG_FILE: If set, logs are written to this file instead of stderr.
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
Contributions are welcome! Please feel free to open an issue or submit a pull request.