Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
celerity_lat_results.csv	celerity_lat_results.csv
celerity_thr_results.csv	celerity_thr_results.csv
libzmq_lat_results.csv	libzmq_lat_results.csv
libzmq_thr_results.csv	libzmq_thr_results.csv
run_lat.sh	run_lat.sh
run_thr.sh	run_thr.sh

Performance Notes

This directory is for repeatable benchmark workflow around Celerity.

The benchmark binaries themselves stay in src/bin/ so Cargo can build and run them normally:

local_lat
remote_lat
local_thr
remote_thr

Goal

The point of these benchmarks is not to produce flattering numbers. The point is to make it easy to compare Celerity and libzmq under the same conditions.

For throughput, the intended comparison is:

Celerity local_thr / remote_thr
libzmq local_thr / remote_thr

For latency, the intended comparison is:

Celerity local_lat / remote_lat
libzmq local_lat / remote_lat

Run the receiver first, then the sender.

Recommended Sizes

Use the same message count for each size unless you have a reason to change it.

1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536

The current defaults used in this repo are:

Throughput: 1000000 messages
Latency: 10000 round trips

Celerity Throughput

Terminal 1:

cargo run --release --features tokio --bin local_thr -- tcp://127.0.0.1:5555 1024 1000000

Terminal 2:

cargo run --release --features tokio --bin remote_thr -- tcp://127.0.0.1:5555 1024 1000000

local_thr is the side that reports the benchmark result. remote_thr just pushes the traffic.

Celerity Latency

Terminal 1:

cargo run --release --features tokio --bin local_lat -- tcp://127.0.0.1:5555 1024 10000

Terminal 2:

cargo run --release --features tokio --bin remote_lat -- tcp://127.0.0.1:5555 1024 10000

remote_lat is the side that reports the benchmark result. It performs one warmup round trip and then times the remaining request/reply loop.

libzmq Throughput

Terminal 1:

./perf/local_thr tcp://127.0.0.1:5555 1024 1000000

Terminal 2:

./perf/remote_thr tcp://127.0.0.1:5555 1024 1000000

Current Baseline

The tables below record the current matched-count loopback comparison between:

Celerity from this repo
libzmq built locally

These results were collected with:

Throughput: COUNT=1000000 ./perf/run_thr.sh
Latency: COUNT=10000 ./perf/run_lat.sh
Matching libzmq perf binaries with the same endpoint, sizes, and counts

The tables below show the key comparison sizes used for quick baseline checks.

Latency Baseline

Size	libzmq	Celerity	Gap
1B	58.96 us	103.73 us	1.76x
64B	53.62 us	88.36 us	1.65x
1KB	56.14 us	98.83 us	1.76x
64KB	140.83 us	197.30 us	1.40x

Throughput Baseline

Size	libzmq	Celerity	Gap
1B	4,393,364 msg/s	865,874 msg/s	5.07x
64B	3,564,719 msg/s	787,369 msg/s	4.53x
1KB	896,963 msg/s	738,564 msg/s	1.21x
4KB	247,479 msg/s	411,541 msg/s	Celerity 1.66x faster
64KB	11,619 msg/s	23,519 msg/s	Celerity 2.02x faster

Notes:

Small-message throughput is still where libzmq leads most clearly.
At 4KB and 64KB, the current Celerity transport path is faster than the local libzmq build on this machine.
These numbers are machine-specific and should be treated as a baseline, not a guarantee.

Suggested Method

Use --release.
Run on the same machine for both Celerity and libzmq.
Keep the endpoint, message size, and message count identical.
Run each case at least 3 times.
Compare median-to-median or best-of-3 to best-of-3, but stay consistent.
Record the receiver-side throughput, not the sender-side timing.
Record latency from the requester side, not the responder side.

Result Template

Throughput

implementation	transport	size (bytes)	count	msg/s	MiB/s	notes
celerity	tcp loopback	1024	100000
libzmq	tcp loopback	1024	100000

Latency

implementation	transport	size (bytes)	round trips	avg latency (us)	notes
celerity	tcp loopback	1024	100000
libzmq	tcp loopback	1024	100000

Automation

There is a helper script in this directory:

./perf/run_thr.sh

It runs the Celerity throughput pair for the default size grid, prints a compact summary, and writes:

perf/celerity_thr_results.csv

There is also a latency runner:

./perf/run_lat.sh

It prints the latency summary and writes:

perf/celerity_lat_results.csv

Both scripts accept CSV_PATH=... if you want to override the output location.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Performance Notes

Goal

Recommended Sizes

Celerity Throughput

Celerity Latency

libzmq Throughput

Current Baseline

Latency Baseline

Throughput Baseline

Suggested Method

Result Template

Throughput

Latency

Automation

FilesExpand file tree

perf

Directory actions

More options

Directory actions

More options

Latest commit

History

perf

Folders and files

parent directory

README.md

Performance Notes

Goal

Recommended Sizes

Celerity Throughput

Celerity Latency

libzmq Throughput

Current Baseline

Latency Baseline

Throughput Baseline

Suggested Method

Result Template

Throughput

Latency

Automation