中文 | English
Paper Address | Python Project Address
- Installation of essential deep learning libraries for C++ (OpenCV, TensorRT, LibTorch, CUDA)
- CMakeLists.txt and VSCode configuration
- Implementation of dual-model TensorRT acceleration
- Three preprocessing methods (OpenCV, OpenCV DNN, OpenCV CUDA + kernel function optimization)
- FP16 optimization for acceleration
- Post-processing using LibTorch in C++
- Real-time video processing and inference
It is well known that deploying deep learning models with C++ can be challenging, yet the speed it offers makes it the preferred choice for industrial deployment. Unlike common YOLO or Mask R-CNN deployment projects, industrial defect detection models take images as both input and output, and edge scenarios require computation time to be compressed as much as possible. As a result, reference material for the image preprocessing, model acceleration, and post-processing code in this project is scarce, and we hope it provides a useful reference for similar implementations.
Additionally, configuring OpenCV, TensorRT, and other libraries through CMake in VSCode has limited public documentation. This tutorial aims to provide detailed configuration methods to improve development efficiency.
After four rounds of optimization, the end-to-end processing time per image is under 30 ms, with throughput above 30 imgs/s, achieving real-time inference and more than doubling the inference speed. (Display time is not included in the table below.)
| Optimization | Preprocessing Time (ms) | Inference Time (ms) | Post-processing Time (ms) | Total Time (ms) |
|---|---|---|---|---|
| Initial C++ Implementation | 11 | 17 | 5 | 33 |
| Preprocessing DNN Acceleration | 8 | 17 | 5 | 30 |
| Preprocessing CUDA Acceleration | 4 | 17 | 5 | 26 |
| Shallow Copy Optimization | 4 | 17 | 3 | 24 |
| FP16 Inference Acceleration | 4 | 8 | 3 | 15 |
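For reference, per-stage times like those in the table can be collected by timing each stage individually; below is a minimal, hypothetical sketch using `std::chrono` (the placeholder lambdas stand in for the project's actual preprocessing, inference, and post-processing calls):

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Measures one callable stage and returns the elapsed time in milliseconds.
template <typename F>
double timed_ms(F&& stage) {
    auto t0 = std::chrono::steady_clock::now();
    stage();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    // Placeholder stages; in the real pipeline these would be the CUDA
    // preprocessing, TensorRT inference, and LibTorch post-processing calls.
    auto preprocess  = [] { std::this_thread::sleep_for(std::chrono::milliseconds(4)); };
    auto infer       = [] { std::this_thread::sleep_for(std::chrono::milliseconds(8)); };
    auto postprocess = [] { std::this_thread::sleep_for(std::chrono::milliseconds(3)); };

    double pre  = timed_ms(preprocess);
    double inf  = timed_ms(infer);
    double post = timed_ms(postprocess);
    std::printf("pre %.1f ms | infer %.1f ms | post %.1f ms | total %.1f ms\n",
                pre, inf, post, pre + inf + post);
}
```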
Hardware/Software Configuration (Reference)
- CPU: Intel(R) Core(TM) i5-12500H @ 2.50GHz
- RAM: 16 GB
- GPU: NVIDIA GeForce RTX 3050 Laptop GPU (4 GB)
- Windows 11
- CUDA==11.6
- Other library versions detailed later
(A modest configuration: it should run on most laptops 😂, though perhaps not as quickly as expected.)
Due to space limitations, only version numbers are listed here. For detailed installation, refer to Installation Guide.
- CMake==3.15.7
- Visual Studio == 2019 (MSVC==v142)
- CUDA==11.6
- OpenCV==4.5.5 (contrib==4.5.5)
- TensorRT==10.12.0.36
- LibTorch==1.13.0
Adjust paths in the following code to your actual installation paths. For CMake customization, see CMake Configuration Guide.
```cmake
set(OpenCV_DIR "C:/opencv_s/build/install")   # Actual OpenCV install path
set(TRT_DIR "C:/tensorRT/TensorRT-10.12.0.36.Windows.win10.cuda-11.8/TensorRT-10.12.0.36")   # TensorRT root directory
set(Torch_DIR "~/libtorch/share/cmake/Torch")   # Corresponding LibTorch path

find_package(Torch REQUIRED)
include_directories(~/libtorch/include/torch/csrc/api/include)
include_directories(~/libtorch/include)
link_directories(~/libtorch/lib)
```
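Once the paths above resolve, a small sanity-check program (not part of the project, just a hedged example) can confirm that OpenCV, LibTorch, and TensorRT all compile and link:

```cpp
#include <iostream>
#include <opencv2/core.hpp>   // OpenCV
#include <torch/torch.h>      // LibTorch
#include <NvInfer.h>          // TensorRT (pulls in the NV_TENSORRT_* version macros)

int main() {
    std::cout << "OpenCV   : " << cv::getVersionString() << "\n";
    std::cout << "LibTorch : CUDA available = " << std::boolalpha
              << torch::cuda::is_available() << "\n";
    std::cout << "TensorRT : " << NV_TENSORRT_MAJOR << "."
              << NV_TENSORRT_MINOR << "." << NV_TENSORRT_PATCH << "\n";
    return 0;
}
```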
- Search for CMake in the VSCode extensions and install `CMake` and `CMake Tools`
- Restart VSCode, press `Ctrl+Shift+P`, type `cmake`, and select the `vs2019 amd64` compiler
- Open `CMakeLists.txt` and press `Ctrl+S` to auto-compile, or press `Ctrl+Shift+P`, type `cmake`, and select `configure`
Debug Configuration
```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "(gdb) Launch",
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/build/Debug/main", // Default CMake output path
            "args": [], // Command line arguments
            "stopAtEntry": false,
            "cwd": "${workspaceFolder}",
            "environment": [],
            "externalConsole": false,
            "MIMode": "gdb",
            "setupCommands": [ // GDB optimization
                { "text": "-enable-pretty-printing", "ignoreFailures": true }
            ]
        }
    ]
}
```
- Export engine/onnx files from the Python project and place them in `/input`. Without engine files, they will be auto-generated from onnx at runtime (a sketch of this step follows this list).
- Place inference video/image files in `/input`
- Modify the filenames in `main.cpp`
- Run the program (an empty img path triggers video inference; a specified path triggers image inference)
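The onnx-to-engine conversion mentioned above could look roughly like the sketch below (the `Logger` class and `buildEngineFromOnnx` function are illustrative names, not the project's actual code; it assumes the standard TensorRT 8+/10.x builder API):

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <cstdio>
#include <memory>
#include <vector>

// Minimal logger required by the TensorRT builder.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::printf("[TRT] %s\n", msg);
    }
};

// Hypothetical helper: parse an ONNX file and return the serialized engine plan.
std::vector<char> buildEngineFromOnnx(const char* onnxPath, bool useFp16) {
    using namespace nvinfer1;
    static Logger logger;

    std::unique_ptr<IBuilder> builder{createInferBuilder(logger)};
    std::unique_ptr<INetworkDefinition> network{builder->createNetworkV2(0)};
    std::unique_ptr<nvonnxparser::IParser> parser{
        nvonnxparser::createParser(*network, logger)};
    parser->parseFromFile(onnxPath, static_cast<int>(ILogger::Severity::kWARNING));

    std::unique_ptr<IBuilderConfig> config{builder->createBuilderConfig()};
    if (useFp16) config->setFlag(BuilderFlag::kFP16);  // see the FP16 notes below

    // The serialized plan can be written to /input/*.engine and reused on later runs.
    std::unique_ptr<IHostMemory> plan{builder->buildSerializedNetwork(*network, *config)};
    const char* data = static_cast<const char*>(plan->data());
    return std::vector<char>(data, data + plan->size());
}
```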
```mermaid
flowchart TD
    F[main.cpp] --> A[video_thread.cpp]
    A -->|Read video frame| B[pipeline.cpp]
    B -->|Call| C[utils.cu Preprocessing]
    C -->|Return processed frame| B
    B -->|Execute| D[pipeline::inference]
    D -->|Return inference result| B
    B -->|Return annotated frame| A
    A -->|Display result| E[Display Interface]
```
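In code, the loop in `video_thread.cpp` follows the diagram above; here is a simplified, hypothetical sketch (the `Pipeline` interface shown is illustrative, not the project's exact class):

```cpp
#include <opencv2/opencv.hpp>
#include <string>

// Stand-in for pipeline.cpp: CUDA preprocessing (utils.cu), TensorRT inference,
// and LibTorch post-processing all happen inside process().
struct Pipeline {
    cv::Mat process(const cv::Mat& frame) { return frame.clone(); }  // placeholder
};

// Stand-in for video_thread.cpp: read frames, run the pipeline, display results.
void video_thread(const std::string& videoPath, Pipeline& pipeline) {
    cv::VideoCapture cap(videoPath);
    cv::Mat frame;
    while (cap.read(frame)) {                          // read one video frame
        cv::Mat annotated = pipeline.process(frame);   // preprocess + inference + post-process
        cv::imshow("result", annotated);               // display interface
        if (cv::waitKey(1) == 27) break;               // ESC to stop
    }
}
```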
- Utilizes `cv::cuda` with a kernel-optimized HWC→CHW conversion (sketched below)
- Preprocessing time reduced to 36.4% of the CPU-based method
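A rough host-side view of this preprocessing path (assuming OpenCV is built with the CUDA contrib modules; the HWC→CHW kernel itself lives in `utils.cu` and is only hinted at in the comment):

```cpp
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaimgproc.hpp>
#include <opencv2/cudawarping.hpp>

// Hypothetical GPU preprocessing sketch: upload, resize, and normalize on the GPU,
// then hand the result to a custom HWC->CHW kernel implemented in utils.cu.
void preprocess_gpu(const cv::Mat& frame, cv::cuda::GpuMat& chw_blob, cv::Size inputSize) {
    cv::cuda::GpuMat gpu_frame, resized, rgb, normalized;
    gpu_frame.upload(frame);                              // host -> device copy
    cv::cuda::resize(gpu_frame, resized, inputSize);      // resize on the GPU
    cv::cuda::cvtColor(resized, rgb, cv::COLOR_BGR2RGB);  // BGR -> RGB on the GPU
    rgb.convertTo(normalized, CV_32FC3, 1.0 / 255.0);     // scale to [0, 1]

    // The project's custom kernel would convert the interleaved HWC float image
    // into a planar CHW blob here, e.g. (illustrative call only):
    // hwc_to_chw_kernel<<<grid, block>>>(normalized.ptr<float>(), chw_blob.ptr<float>(), ...);
    (void)chw_blob;  // placeholder in this sketch
}
```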
- Activated via the CMake flag `USE_DIRECT_BLOB=ON`
- Reduces post-processing time by 20% by avoiding data transfers (see the sketch after this list)
- Essential for handling multi-output tensor models
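One way such a shallow copy can work with LibTorch post-processing is `torch::from_blob`, which wraps an existing buffer as a tensor without copying it; the sketch below is hypothetical and assumes the TensorRT output buffer already lives on the GPU:

```cpp
#include <torch/torch.h>

// Wrap a TensorRT output buffer as a torch::Tensor without copying the data.
// The caller must keep the underlying buffer alive while the tensor is in use.
torch::Tensor wrap_trt_output(float* device_buffer, int64_t c, int64_t h, int64_t w) {
    auto options = torch::TensorOptions()
                       .dtype(torch::kFloat32)
                       .device(torch::kCUDA);   // the buffer is device memory
    // from_blob performs no copy; LibTorch post-processing ops can run on it directly.
    return torch::from_blob(device_buffer, {1, c, h, w}, options);
}
```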
- Enabled during engine export (FP32 input/output maintained for compatibility)
- Halves model memory footprint and inference time (47.1% of FP32)
- Critical for edge devices with limited VRAM (e.g., 4GB GPUs)
- <30ms end-to-end latency per image (display excluded)
- Sustained throughput >30 FPS, meeting industrial real-time requirements
- Multi-threaded pipeline for parallel preprocessing/inference/post-processing
- Simplified GUI interface
- ......
If you find this project helpful, please give a ⭐ on GitHub
Suggestions and PRs are welcome in Issues
Your support fuels continuous improvement!
This repository is licensed under the Apache-2.0 License.