This repository contains a comparative benchmark of RTSP video stream processing in Rust and Python. It decodes RTSP streams, applies grayscale conversion and convolution, and measures performance metrics such as FPS, frame processing time, and dropped frames.
rust_bench/
├── Cargo.lock
├── Cargo.toml
├── src/
│   └── main.rs          # Rust benchmark code
├── target/
│   └── release/
│       ├── rust_bench   # Compiled Rust binary
│       ├── deps/        # Dependency files
│       └── other build files
├── test.raw             # Sample raw output (optional)
└── rtsp_urls.py         # Python benchmark code
- `src/main.rs`: Rust implementation of the RTSP streaming benchmark.
- `rtsp_urls.py`: Python implementation of the same benchmark.
- `Cargo.toml` & `Cargo.lock`: Rust project configuration and dependency lock files.
- `target/release`: Contains compiled Rust binaries and build artifacts.
- `test.raw`: Optional file for testing raw video output.
Both implementations perform the following operations:
- RTSP Stream Capture: Connect to a local RTSP server (`rtsp://127.0.0.1:8554/stream1`) using FFmpeg and decode the stream into raw RGB24 frames.
- Frame Processing:
  - Convert each frame to grayscale.
  - Apply a 3x3 convolution filter (Sobel-like) multiple times to simulate a CPU-heavy workload (`CONV_ITERS = 4`).
- Metrics Collected:
  - Frames processed: Total frames successfully processed.
  - Dropped frames: Frames that could not be fully read.
  - FPS: Frames per second (throughput).
  - Average processing time per frame.
- Duration: Both benchmarks run for 30 seconds.
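The capture and measurement steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the resolution, the FFmpeg arguments, and the helper names (`ffmpeg_command`, `open_stream`, `run_benchmark`) are hypothetical. Only the URL and the 30-second duration come from this README.

```python
import subprocess
import time

RTSP_URL = "rtsp://127.0.0.1:8554/stream1"  # URL given in this README
WIDTH, HEIGHT = 640, 480          # assumed resolution, not stated in the README
FRAME_BYTES = WIDTH * HEIGHT * 3  # RGB24 uses 3 bytes per pixel
DURATION_S = 30.0                 # run length given in this README


def ffmpeg_command(url, width, height):
    """Build an FFmpeg command that writes raw RGB24 frames to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-rtsp_transport", "tcp",           # assumed transport
        "-i", url,
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}",
        "-",                                # "-" sends the output to stdout
    ]


def open_stream(url=RTSP_URL, width=WIDTH, height=HEIGHT):
    """Spawn FFmpeg and return the process; frames arrive on proc.stdout."""
    return subprocess.Popen(ffmpeg_command(url, width, height),
                            stdout=subprocess.PIPE)


def run_benchmark(stream, frame_bytes, process, duration_s=DURATION_S):
    """Read fixed-size frames until EOF or the deadline and collect metrics."""
    processed = dropped = 0
    busy = 0.0                              # seconds spent inside process()
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        buf = stream.read(frame_bytes)
        if not buf:                         # EOF: the producer has exited
            break
        if len(buf) < frame_bytes:          # a partial frame counts as dropped
            dropped += 1
            continue
        t0 = time.perf_counter()
        process(buf)
        busy += time.perf_counter() - t0
        processed += 1
    elapsed = time.perf_counter() - start
    return {
        "processed": processed,
        "dropped": dropped,
        "fps": processed / elapsed if elapsed > 0 else 0.0,
        "avg_time_s": busy / processed if processed else 0.0,
    }
```

To try this against a live stream, one could pass `open_stream().stdout` as `stream` and `FRAME_BYTES` as `frame_bytes`; the `process` callback is where grayscale conversion and convolution would go. Any file-like object with a `read` method works for offline testing.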
Python implementation (`rtsp_urls.py`):
- Uses `subprocess` to spawn an FFmpeg process that outputs raw RGB24 frames.
- Processes each frame using NumPy:
  - Reshape raw bytes into `(HEIGHT, WIDTH, 3)`.
  - Convert to grayscale using `0.299*R + 0.587*G + 0.114*B`.
  - Apply convolution repeatedly to simulate CPU load.
- Measures FPS and processing time per frame.
- Handles dropped frames if the stream hiccups.
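The NumPy steps listed above might look something like the sketch below. It is an illustration only: the kernel coefficients, the zero-valued border, and the function names are assumptions, since this README does not give them. The slice-based convolution is one possible approach and may not match the loop structure or timing of `rtsp_urls.py`.

```python
import numpy as np

CONV_ITERS = 4  # value given in this README

# Sobel-like horizontal-gradient kernel (assumed coefficients).
KERNEL = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=np.float32)


def to_gray(raw, height, width):
    """Reshape raw RGB24 bytes to (height, width, 3) and convert to grayscale."""
    rgb = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)
    rgb = rgb.astype(np.float32)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]


def conv3x3(img, kernel):
    """Apply a 3x3 kernel to interior pixels; border pixels stay at zero."""
    h, w = img.shape
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            # Shifted view of the image, weighted by one kernel coefficient.
            out[1:h - 1, 1:w - 1] += kernel[dy, dx] * img[dy:h - 2 + dy, dx:w - 2 + dx]
    return out


def process_frame(raw, height, width):
    """Grayscale conversion followed by CONV_ITERS convolution passes."""
    gray = to_gray(raw, height, width)
    for _ in range(CONV_ITERS):
        gray = conv3x3(gray, KERNEL)
    return gray
```

A uniform frame is a quick sanity check: the kernel coefficients sum to zero, so `process_frame` returns an all-zero array for any constant-color input.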
Rust implementation (`src/main.rs`):
- Uses Rust's `std::process::Command` to spawn FFmpeg.
- Reads frames directly into a byte buffer (`Vec<u8>`).
- Frame processing is done manually in Rust without external libraries:
  - Grayscale conversion using the same formula as Python.
  - 3x3 convolution applied repeatedly to match CPU workload.
- Uses `Instant` for high-resolution timing and calculates FPS and average frame processing time.
| Language | Frames Processed | Dropped Frames | FPS | Avg Processing Time/Frame |
|---|---|---|---|---|
| Python | 15 | 0 | 0.48 | 1.8918 s |
| Rust | 836 | 0 | 27.86 | 0.0358 s |
Observations:
- Rust achieves ~58x higher FPS than Python on the same workload (27.86 vs. 0.48 FPS).
- Average frame processing time in Rust is drastically lower due to compiled code and zero-cost abstractions.
- Python is limited by:
- NumPy overhead.
- GIL (Global Interpreter Lock) preventing full CPU utilization.
- Interpreted execution.
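The headline ratio can be rechecked from the table with a few lines of arithmetic. The sketch below only uses the figures reported in this README; the dictionary layout and function names are illustrative.

```python
# Figures copied from the results table in this README.
RESULTS = {
    "Python": {"frames": 15, "fps": 0.48, "avg_s": 1.8918},
    "Rust": {"frames": 836, "fps": 27.86, "avg_s": 0.0358},
}


def fps_ratio(fast, slow):
    """How many times higher the throughput of `fast` is."""
    return RESULTS[fast]["fps"] / RESULTS[slow]["fps"]


def time_ratio(fast, slow):
    """How many times longer `slow` spends on a single frame."""
    return RESULTS[slow]["avg_s"] / RESULTS[fast]["avg_s"]


def implied_elapsed_s(lang):
    """Wall-clock duration implied by frames processed / FPS."""
    r = RESULTS[lang]
    return r["frames"] / r["fps"]


print(f"FPS ratio:  {fps_ratio('Rust', 'Python'):.1f}x")
print(f"Time ratio: {time_ratio('Rust', 'Python'):.1f}x")
for lang in RESULTS:
    print(f"{lang}: implied run time {implied_elapsed_s(lang):.2f} s")
```

The FPS ratio comes out at about 58x, while the ratio of per-frame processing times is closer to 53x. The two differ because FPS is measured over wall-clock time, which also includes reading frames from FFmpeg. The implied Python run time is slightly over 30 seconds, which would be consistent with the last frame finishing after the deadline.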
Delays in Python:
- CPU-bound convolution in NumPy is slower than Rust’s manual loop.
- Python has additional overhead for managing subprocess I/O and memory.
- Sleeping for dropped frames adds minor latency but is negligible compared to processing time.
How to run:
- Python: `python3 rtsp_urls.py`
- Rust: `cargo run --release`

Rust must be compiled in `--release` mode for optimal performance; a debug build will yield a much lower FPS.
Conclusions:
- Rust is significantly faster than Python for real-time frame processing (about 58x higher FPS in this benchmark).
- For CPU-heavy video processing tasks, Rust can achieve near real-time performance (about 28 FPS here) with the same algorithm.
- Python is easier to prototype but not ideal for high-throughput streaming pipelines.
Possible future improvements:
- Multi-threaded processing for both Rust and Python.
- GPU acceleration using OpenCV or Vulkan/OpenCL.
- Streaming over network to multiple clients for stress testing.
- Comparing memory usage in addition to CPU and FPS.
This repository can serve as a reference for:
- Developers deciding between Rust and Python for video processing.
- Students learning about RTSP streaming and benchmarking.
- Performance engineers evaluating high-throughput stream processing.
MIT License – free to use and modify.