De-Par/IDet

IDet

Fast CPU-only ROI Detection Library for Real-Time Pipelines 🚀

C++17 | OpenCV | OpenMP enabled | NUMA aware | Meson | Tests: GTest | Linux and macOS

ONNX Runtime CPU | DBNet for text | SCRFD for face | YOLO for cloth

Overview

IDet is a fast, production-oriented, CPU-only C++ library for image detection pipelines, built on top of ONNX Runtime. The library supports three modes: text detection (DBNet / DBNet++ / PP-OCR-style models), face detection (SCRFD family), and cloth detection (YOLO family). Key features include tiled inference, polygon NMS, IOBinding (zero per-frame allocations), explicit threading and memory control, and reproducible performance profiles for modern multi-core CPUs.
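To make the idea of tiled inference concrete, here is a minimal sketch of how an image could be split into an RxC grid of overlapping tiles. It is an illustration only: `TileRect` and `make_tiles` are hypothetical names, not part of the IDet API, and the library's own tiling rules may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration; not part of the IDet API.
struct TileRect {
    int x, y, w, h;
};

// Split a width x height image into rows x cols tiles. Each tile is grown by
// `overlap` pixels on every side and then clamped to the image bounds.
std::vector<TileRect> make_tiles(int width, int height, int rows, int cols, int overlap) {
    assert(width > 0 && height > 0 && rows > 0 && cols > 0 && overlap >= 0);
    std::vector<TileRect> tiles;
    tiles.reserve(static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    for (int r = 0; r < rows; ++r) {
        // Integer cell boundaries; this formula spreads any remainder across cells.
        const int y0 = r * height / rows;
        const int y1 = (r + 1) * height / rows;
        for (int c = 0; c < cols; ++c) {
            const int x0 = c * width / cols;
            const int x1 = (c + 1) * width / cols;
            const int tx0 = std::max(0, x0 - overlap);
            const int ty0 = std::max(0, y0 - overlap);
            const int tx1 = std::min(width, x1 + overlap);
            const int ty1 = std::min(height, y1 + overlap);
            tiles.push_back({tx0, ty0, tx1 - tx0, ty1 - ty0});
        }
    }
    return tiles;
}
```

Detections found inside a tile would then be shifted by that tile's `x` and `y` to return to full-image coordinates. Objects that fall in an overlap region can be reported by more than one tile, which is why an NMS pass over the stitched results is needed.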

Why IDet?

Most demo repos optimize for “it runs”. IDet optimizes for:

  • CPU-first deployment
  • reproducible performance experiments
  • low allocation churn
  • controllable threading
  • maintainable C++ integration
  • model-agnostic detector pipelines (within supported output contracts)

This makes IDet suitable for:

  • server-side CPU inference
  • embedded-ish x86 / ARM deployments (when GPU is not available or not desired)
  • benchmarking and systems-level performance tuning
  • integrating detection into larger C++ products

Documentation

Detailed documentation is organized into focused, self-contained pages under docs/:

Page                     What it covers
Requirements             Host dependencies for Linux/macOS and project scope
Build & Install          Toolchain profiles, ORT setup, Meson options, build/test/install flow
Model Zoo                Supported model families, conversion notes, compatibility rules
Command-line Options     Actual idet_app flags, defaults, validation rules
Quick Start              Text / face / cloth scripts and direct smoke commands
C++ Integration Guide    Blocking API, hot-loop worker, image lifetime, fixed-shape IOBinding
Performance Guide        Runtime policy, benchmark output, tuning, IOBinding, tiling and NMS
Troubleshooting & FAQ    Common failures, ORT ABI mismatch, FAQ
Doxygen API Reference    Generate and browse local API HTML documentation
ACL Execution Provider   Use ACL execution provider instead of MLAS

Quick Start

source toolchain/activate.sh
scripts/build.sh force -- -Didet_libtype=shared
scripts/run_tests.sh
scripts/run_idet_text.sh

Build the checked C++ integration examples:

scripts/build.sh force -- -Dbuild_examples=true

Then run them from the repository root:

"${BUILD_DIR}/examples/sync_detector"
"${BUILD_DIR}/examples/hot_loop_worker"

sync_detector runs one image and prints every detected quad. hot_loop_worker reads assets/videos/test.yuv as I420 video by default and processes frames asynchronously while the application loop continues its own work.
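Raw I420 files carry no header, so a reader has to know the frame dimensions in advance and step through the file one fixed-size frame at a time. The sketch below shows that arithmetic for 8-bit I420; `i420_frame_bytes` and `read_i420_frame` are hypothetical helpers written for illustration, not the code used by hot_loop_worker.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <vector>

// Bytes in one 8-bit I420 frame: a full-resolution Y plane followed by U and V
// planes subsampled by 2 in both directions (rounded up for odd dimensions).
std::size_t i420_frame_bytes(int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    const std::size_t chroma = ((w + 1) / 2) * ((h + 1) / 2);
    return w * h + 2 * chroma;
}

// Read the next frame from a raw stream into `frame`.
// Returns false at end of stream or when only a partial frame remains.
bool read_i420_frame(std::istream& in, int width, int height,
                     std::vector<unsigned char>& frame) {
    frame.resize(i420_frame_bytes(width, height));
    in.read(reinterpret_cast<char*>(frame.data()),
            static_cast<std::streamsize>(frame.size()));
    return static_cast<std::size_t>(in.gcount()) == frame.size();
}
```

Reusing the same `frame` vector across calls means the buffer is allocated once and only overwritten afterwards, which fits the low-allocation goal described in this README.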

Highlights

  • High-performance CPU inference (x86 / ARM, Linux & macOS)
  • 🧠 Multiple detection pipelines: text, face, and cloth detection
  • 🧩 Tiled inference (RxC) with overlap for small-object recall
  • 📐 Polygon-based post-processing with NMS
  • 💾 ONNX Runtime IOBinding for reusable input/output buffers
  • 📈 Benchmark mode with p50 / p90 / p95 / p99 latency
  • 🔒 Accurate logging & error handling: all interaction goes through wrappers
  • 🧵 Explicit threading model:
    • OpenMP → outer parallelism (tiles / batches)
    • ONNX Runtime → intra-op / inter-op graph execution
  • 🔧 Runtime policy controls:
    • affinity / topology-aware execution (where supported)
    • optional NUMA memory locality helpers
    • OpenCV thread suppression to avoid oversubscription
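The two-level threading model above only pays off if the outer and inner thread counts do not multiply past the number of available cores. The following is a minimal sketch of one way to split a core budget; `ThreadPlan` and `plan_threads` are hypothetical names, and IDet's actual runtime policy may use different rules.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical plan type for illustration; not part of the IDet API.
struct ThreadPlan {
    int outer_workers;     // e.g. OpenMP threads iterating over tiles or batches
    int intra_op_threads;  // e.g. ONNX Runtime intra-op threads per worker
};

// Split `cores` between outer workers and per-worker intra-op threads so that
// outer_workers * intra_op_threads never exceeds `cores`.
ThreadPlan plan_threads(int cores, int parallel_items) {
    assert(cores > 0 && parallel_items > 0);
    const int outer = std::min(cores, parallel_items);
    const int intra = std::max(1, cores / outer);
    return {outer, intra};
}
```

In an application, such values would typically be handed to `omp_set_num_threads`, to `Ort::SessionOptions::SetIntraOpNumThreads`, and paired with `cv::setNumThreads(1)` so that OpenCV does not spawn a third pool of its own. With 16 cores and 4 tiles this sketch yields 4 outer workers with 4 intra-op threads each.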

Project Scope

IDet is intentionally a low-level inference toolkit, not a full OCR stack or face recognition framework.

In scope

  • ONNX Runtime-based CPU detector execution
  • pre-processing / post-processing for supported detector families
  • tiled inference and result stitching
  • performance benchmarking and runtime policy control
  • library integration + CLI demo tooling

Out of scope

  • OCR text recognition and language models
  • ROI recognition / embeddings / tracking
  • dataset labeling / training pipelines
  • GUI applications

Credits

This project uses the following libraries and frameworks:

  • OpenCV (image data processing, isolated behind adapters in library internals)
  • OpenMP (fast tiled inference)
  • ONNX Runtime (inference core engine)
  • NUMA (CPU/memory binding and topology for multi-socket nodes)
  • GTest (test coverage)
  • Indicators (pretty output with progress bar)

Supported model families:

  • DBNet / DBNet++ / PP-OCR (text detection)
  • SCRFD (face detection)
  • YOLO (cloth detection)
