`fast-img-proc`: A fast image processing library

A fast, memory-efficient image processing library implemented in C++ that uses parallel programming on the CPU, with optional GPU acceleration using CUDA. Bindings generated with nanobind provide a Pythonic interface to the library.

The supported image processing operations are:

Grayscale Conversion
Histogram Equalization
Edge Detection (GPU support)
Blur
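
To make the Edge Detection entry concrete: a Sobel filter estimates the intensity derivative along one axis while smoothing along the other. The pure-Python sketch below applies the standard 3x3 Sobel kernels to a tiny grayscale image. It is illustrative only and is not the library's C++ or CUDA implementation; the helper name `apply_kernel_3x3` is made up for this example.

```python
# Standard 3x3 Sobel kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # derivative on x, smoothing on y
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # derivative on y, smoothing on x


def apply_kernel_3x3(image, kernel):
    """Correlate a 2D list of intensities with a 3x3 kernel.

    Border pixels are left at 0 (hypothetical border handling for this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc
    return out


# A 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
grad_x = apply_kernel_3x3(image, SOBEL_X)
grad_y = apply_kernel_3x3(image, SOBEL_Y)
print(grad_x[1][1], grad_y[1][1])  # prints: 40 0
```

The vertical edge produces a strong response in the x-derivative and none in the y-derivative, which is the distinction the derivative-order arguments to edge detection select between.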

The supported image formats are:

PNG
JPG

Supported platforms are:

Linux
macOS (coming soon!)

Required Software Packages

C++ (required): compiler that supports C++20
CMake 3.18+ (required) :
- Linux: sudo apt-get -y install cmake
- macOS: brew install cmake
Python 3.7+ (required) :
- Linux: sudo apt install python3-dev
- macOS: brew install python
- [github link]
TBB (required) :
- Linux: sudo apt-get install libtbb-dev
- macOS: brew install tbb
- [github link]
nanobind (required) : included as a git submodule in src/external/nanobind [github link]
CUDA Toolkit & Driver (optional): NVIDIA Installation guide

Installing `fast-img-proc`

git clone --recurse-submodules git@github.com:pavan1011/fast-img-proc
cd fast-img-proc && mkdir build && cd build

Configure default build (without CUDA support)

cmake -S ../ -B .

which is equivalent to

cmake -DCMAKE_BUILD_TYPE=Release -S ../ -B .

Creating Python Virtual Environment (optional)

Linux:

# Install python virtual env (if not installed)
sudo apt install python3-virtualenv

# Create build directory
mkdir build && cd build

# Create python virtual environment
python -m venv /path/to/venv
source /path/to/venv/bin/activate

# virtual environment activated

# To deactivate python venv
deactivate

Configure with CUDA support

Requires the CUDA compiler (nvcc) to be installed.

cmake -S ../ -B . -DUSE_CUDA=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12/bin/nvcc

NOTE: The path /usr/local/cuda-12 used above might be different on your machine. Update both flags to match your CUDA installation.

Build `fast-img-proc`

cmake --build .

Following the above steps generates fast_image_processing.cpython-<python-version>-<arch>-<platform>.so in /path/to/fast-img-proc/build

Update PYTHONPATH environment variable

export PYTHONPATH=$PYTHONPATH:/path/to/build/directory
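
As an alternative to editing PYTHONPATH, a script can locate the compiled module itself. The following is a minimal sketch, assuming the build produced a file whose name starts with `fast_image_processing` and ends in `.so` somewhere under the build directory; the helper names `find_extension_dir` and `add_to_sys_path` are hypothetical and not part of the library.

```python
import sys
from pathlib import Path


def find_extension_dir(build_dir, module_name="fast_image_processing"):
    """Return the directory under build_dir containing the compiled module, or None."""
    for candidate in sorted(Path(build_dir).rglob(f"{module_name}*.so")):
        return str(candidate.parent)
    return None


def add_to_sys_path(build_dir):
    """Append the module's directory to sys.path; return True if it was found."""
    module_dir = find_extension_dir(build_dir)
    if module_dir is None:
        return False
    if module_dir not in sys.path:
        sys.path.append(module_dir)
    return True
```

Calling `add_to_sys_path("/path/to/fast-img-proc/build")` before `import fast_image_processing` has the same effect as the export above, but only for the current Python process.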

List of all Build Flags

-DCMAKE_BUILD_TYPE: provides option to set the following types of builds:
- Debug: shows debug, info, warn, and error logs
- Release: shows warn and error logs only
-DUSE_CUDA: optionally enables GPU acceleration for supported image processing algorithms
-DBUILD_DOCUMENTATION: optionally enables detailed documentation generation locally using Doxygen
-DCMAKE_CUDA_COMPILER: path to CUDA compiler. Required if -DUSE_CUDA is set to ON. Usually at /usr/local/cuda-<version>/bin/nvcc
-DCUDA_TOOLKIT_ROOT_DIR: path to CUDA toolkit. Required if -DUSE_CUDA is set to ON. Usually at /usr/local/cuda-<version>
-DPYTHON_EXECUTABLE: hints to the CMake build system which Python interpreter to use (useful for virtual environments and non-default Python installations).
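
The flags above can be combined in a small helper when the configure step is scripted. This is a sketch only: the function `cmake_configure_args` and its parameter names are hypothetical, and it merely assembles the flags documented in this list into an argument list without running CMake.

```python
def cmake_configure_args(source_dir="..", build_dir=".", build_type="Release",
                         use_cuda=False, cuda_root=None, cuda_compiler=None,
                         build_docs=False, python_executable=None):
    """Build the argument list for the CMake configure step."""
    args = ["cmake", f"-DCMAKE_BUILD_TYPE={build_type}"]
    if use_cuda:
        # Both CUDA paths are required when USE_CUDA is ON.
        if not (cuda_root and cuda_compiler):
            raise ValueError("use_cuda=True requires cuda_root and cuda_compiler")
        args += [
            "-DUSE_CUDA=ON",
            f"-DCUDA_TOOLKIT_ROOT_DIR={cuda_root}",
            f"-DCMAKE_CUDA_COMPILER={cuda_compiler}",
        ]
    if build_docs:
        args.append("-DBUILD_DOCUMENTATION=ON")
    if python_executable:
        args.append(f"-DPYTHON_EXECUTABLE={python_executable}")
    args += ["-S", source_dir, "-B", build_dir]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` from inside the build directory.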

Usage in Python

More examples, along with performance profiling, are available in fast-img-proc/scripts.

Below is the basic usage:

# If you didn't update PYTHONPATH to point to the build directory,
# add the directory containing the generated .so to sys.path before importing
import sys
sys.path.append('/path/to/your/build/directory')

# fast-img-proc is exposed as fast_image_processing using nanobind
import fast_image_processing as fip

def main():
    # Load an RGB image (PNG or JPG supported)
    # using stbi_load from stb library
    input_image = fip.Image("input.png")

    # Check whether a GPU with CUDA is available
    print(f"GPU Available: {fip.is_gpu_available()}")
    print(f"Active Hardware: {fip.get_active_hardware()}")

    # Examples using automatic hardware selection

    # Convert to grayscale using automatic hardware selection (default)
    auto_grayscale = fip.grayscale(input_image)
    # Save resultant grayscale image
    auto_grayscale.save("grayscale_auto.png")

    # Apply histogram equalization using automatic hardware selection (default)
    auto_equalize_histogram = fip.equalize_histogram(input_image)
    auto_equalize_histogram.save("equalize_histogram_auto.png")

    # Apply Gaussian blur using automatic hardware selection (default)
    auto_blur = fip.blur(input_image)
    auto_blur.save("blur_auto.png")

    # Apply Sobel edge detection using automatic hardware selection (default)
    # Derivative on x-axis, smoothing on y-axis, kernel_size = 5x5
    auto_edge_detect_1_0_5 = fip.edge_detect(input_image, 1, 0, 5)
    auto_edge_detect_1_0_5.save("auto_edge_detect_1_0_5.png")

    # Examples using CPU

    # Convert to grayscale on CPU
    cpu_grayscale = fip.grayscale(input_image, fip.Hardware.CPU)
    cpu_grayscale.save("grayscale_cpu.png")

    # Equalize Histogram of an RGB image on CPU
    cpu_hist_equalized_rgb = fip.equalize_histogram(input_image, fip.Hardware.CPU)
    cpu_hist_equalized_rgb.save("hist_equalized_rgb_cpu.png")

    # Equalize Histogram of a grayscale image on CPU
    cpu_hist_equalized_gray = fip.equalize_histogram(cpu_grayscale, fip.Hardware.CPU)
    cpu_hist_equalized_gray.save("hist_equalized_gray_cpu.png")

    # Edge detection on CPU
    cpu_edge_1_0_5 = fip.edge_detect(input_image, 1, 0, 5, fip.Hardware.CPU)
    cpu_edge_1_0_5.save("edge_1_0_5_cpu.png")

    # Derivatives on y-axis, smoothing on x-axis, kernel_size = 5
    cpu_edge_0_1_5 = fip.edge_detect(input_image, 0, 1, 5, fip.Hardware.CPU)
    cpu_edge_0_1_5.save("edge_0_1_5_cpu.png")

    # Derivatives on x-axis and y-axis, kernel_size = 5
    cpu_edge_1_1_5 = fip.edge_detect(input_image, 1, 1, 5, fip.Hardware.CPU)
    cpu_edge_1_1_5.save("edge_1_1_5_cpu.png")

    try:
        # Edge detection on GPU
        # Derivatives on x-axis, smoothing on y-axis, kernel_size = 5x5
        gpu_edge_1_0_5 = fip.edge_detect(input_image, 1, 0, 5, fip.Hardware.GPU)
        gpu_edge_1_0_5.save("edge_1_0_5_gpu.png")

        # Smoothing on x-axis, derivative on y-axis, kernel_size = 5x5
        gpu_edge_0_1_5 = fip.edge_detect(input_image, 0, 1, 5, fip.Hardware.GPU)
        gpu_edge_0_1_5.save("edge_0_1_5_gpu.png")

        # Derivative on x and y axis, kernel_size = 5x5
        gpu_edge_1_1_5 = fip.edge_detect(input_image, 1, 1, 5, fip.Hardware.GPU)
        gpu_edge_1_1_5.save("edge_1_1_5_gpu.png")

    except RuntimeError as ex:
        print(f"GPU processing failed: {ex}")

if __name__ == "__main__":
    main()
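
The try/except pattern in the example above can be wrapped in a reusable helper that retries on the CPU when a GPU call fails. This is a minimal sketch, assuming (as the example does) that a failed GPU call raises `RuntimeError` and that the hardware selector is the last positional argument; `run_with_fallback` is a hypothetical name and not part of the library.

```python
def run_with_fallback(operation, *args, preferred, fallback):
    """Call operation(*args, preferred); on RuntimeError retry with fallback.

    Returns a (result, hardware_used) tuple.
    """
    try:
        return operation(*args, preferred), preferred
    except RuntimeError as ex:
        print(f"{preferred} failed ({ex}); retrying on {fallback}")
        return operation(*args, fallback), fallback


# Intended use with the bindings shown above (not executed here):
# edges, used = run_with_fallback(
#     fip.edge_detect, input_image, 1, 0, 5,
#     preferred=fip.Hardware.GPU, fallback=fip.Hardware.CPU,
# )
```

Returning the hardware that was actually used lets the caller log or name output files accordingly.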

Testing

The default build does not build the tests. To enable them and run the tests locally, follow the instructions below.

Install gtest

Linux:

sudo apt-get install libgtest-dev

macOS:

brew install googletest

Install `pytest` Python package

python3 -m pip install pytest

Configure Build with Testing Enabled

cmake -DBUILD_TESTS=ON .. <other CMake options>

Build Tests

cmake --build . --target cpp_tests

# Run all tests
ctest

# Run with verbose output
ctest -V

# Run only Python tests
ctest -R python

Detailed Documentation

Detailed documentation of the source files, including class and member definitions, function signatures, and other implementation details, can be generated locally from this project's source files.

Install Doxygen and graphviz

Linux :

sudo apt-get install doxygen graphviz

macOS:

brew install doxygen graphviz

Configure build to generate docs

cd fast-img-proc && mkdir build_docs && cd build_docs
cmake -S ../ -B . <your-build-flags> -DBUILD_DOCUMENTATION=ON

Build documentation files locally

cmake --build . --target docs

This generates detailed documentation, which can be viewed by opening path/to/build_docs/docs/html/index.html.

Build ALL

cmake -DCMAKE_BUILD_TYPE=<build-type> -DUSE_CUDA=ON -DPYTHON_EXECUTABLE=<path-to-python> -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12/bin/nvcc -DBUILD_TESTS=ON -DBUILD_DOCUMENTATION=ON -S ../ -B .

cmake --build .
cmake --build . --target docs

Benchmark comparing CPU vs GPU Implementation

Sobel Edge Detection

Currently, only Sobel edge detection is supported to run on GPU.

The benchmarking script can be run locally from fast-img-proc/scripts/benchmark_edge_detect.py as follows:

cd fast-img-proc
mkdir benchmark_results
python3 ./scripts/benchmark_edge_detect.py <path/to/input_images> benchmark_results

This runs Sobel edge detection, stores the resulting images in benchmark_results, and writes benchmark_results.csv to the current directory.

Results suggest that the GPU implementation runs roughly 3-5x faster than the CPU implementation on larger images (1+ MB).
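
For reference, a speedup factor is the CPU runtime divided by the GPU runtime for the same workload. The sketch below shows one way to measure it; it is not taken from benchmark_edge_detect.py, and the helper names `mean_runtime` and `speedup` are hypothetical.

```python
import time


def mean_runtime(func, repeats=5):
    """Average wall-clock seconds per call, after one untimed warm-up call."""
    func()  # warm-up, excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


def speedup(cpu_seconds, gpu_seconds):
    """Speedup factor of the GPU over the CPU; values above 1 mean the GPU is faster."""
    return cpu_seconds / gpu_seconds


# Intended use with the bindings (not executed here):
# cpu_t = mean_runtime(lambda: fip.edge_detect(img, 1, 0, 5, fip.Hardware.CPU))
# gpu_t = mean_runtime(lambda: fip.edge_detect(img, 1, 0, 5, fip.Hardware.GPU))
# print(f"Speedup: {speedup(cpu_t, gpu_t):.3f}")
```

The warm-up call keeps one-time costs, such as GPU context initialization, out of the averaged timing.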

Benchmark Summary

The following results were obtained by benchmarking on images of different dimensions (w x h).

Mean Speedup on GPU vs CPU by Kernel Size:

kernel_size	Speedup factor
3	3.813
5	3.969
7	4.869

Mean Speedup by Image Size:

Image Dims	Image Size	Speedup factor
960 x 640	38 KB	1.091
2048 x 2048	3.6 MB	3.340
2400 x 2400	6 MB	4.060
3600 x 3600	49 KB (gray)	2.562
6200 x 6200	26.7 MB	4.264
9393 x 4270	27.3 MB	4.868
11472 x 6429	93 MB	4.527
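
Summaries like the two tables above can be derived by grouping per-run measurements and averaging the per-run speedups. The sketch below assumes each row is a mapping with `cpu_ms` and `gpu_ms` fields plus a grouping column such as `kernel_size`; these column names are hypothetical and may not match the layout of benchmark_results.csv.

```python
from collections import defaultdict


def mean_speedup_by(rows, key, cpu_field="cpu_ms", gpu_field="gpu_ms"):
    """Group rows by `key` and return the mean CPU/GPU runtime ratio per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(float(row[cpu_field]) / float(row[gpu_field]))
    return {group: sum(values) / len(values) for group, values in groups.items()}
```

Rows produced by `csv.DictReader` can be passed in directly, since the helper converts the timing fields from strings to floats.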

Credits

Image Loading and Saving

The stb library from https://github.com/nothings/stb (MIT and Public Domain licenses) was used to populate fast-img-proc/external/stb.

stb_image.h: used to load images and represent them as buffers for further processing.
stb_image_write.h: used to save images after processing.

Python Bindings

The nanobind library from https://github.com/wjakob/nanobind (BSD-3-Clause license) was used to generate Pythonic bindings to the fast-img-proc C++ library.

Detailed Documentation

The detailed documentation is generated locally using Doxygen: https://www.doxygen.nl/index.html.

The graphviz library from https://github.com/graphp/graphviz (MIT license) was used to generate dependency diagrams from this project's source files.

The doxygen-awesome-css library from https://github.com/jothepro/doxygen-awesome-css (MIT license) was used for custom styling in this documentation, namely:

fast-img-proc/docs/doxygen-awesome-sidebar-only.css
fast-img-proc/docs/doxygen-awesome.css

My special thanks to the authors and contributors of all the above libraries.
