Merged
65 changes: 57 additions & 8 deletions README.md
@@ -24,13 +24,22 @@

---

## 🗺️ Roadmap

- [x] **Skill architecture** — pluggable `SKILL.md` interface for all capabilities
- [x] **Skill Store UI** — browse, install, and configure skills from Aegis
- [x] **AI/LLM-assisted skill installation** — community-contributed skills installed and configured via AI agent
- [x] **GPU / NPU / CPU (AIPC) aware installation** — auto-detect hardware, install matching frameworks, convert models to optimal format
- [x] **Hardware environment layer** — shared [`env_config.py`](skills/lib/env_config.py) for auto-detection + model optimization across NVIDIA, AMD, Apple Silicon, Intel, and CPU
- [ ] **Skill development** — 18 skills across 9 categories, actively expanding with community contributions

## 🧩 Skill Catalog

Each skill is a self-contained module with its own model, parameters, and [communication protocol](docs/skill-development.md). See the [Skill Development Guide](docs/skill-development.md) and [Platform Parameters](docs/skill-params.md) to build your own.

| Category | Skill | What It Does | Status |
|----------|-------|--------------|:------:|
| **Detection** | [`yolo-detection-2026`](skills/detection/yolo-detection-2026/) | Real-time 80+ class detection — auto-accelerated via TensorRT / CoreML / OpenVINO / ONNX | ✅ |
| | [`dinov3-grounding`](skills/detection/dinov3-grounding/) | Open-vocabulary detection — describe what to find | 📐 |
| | [`person-recognition`](skills/detection/person-recognition/) | Re-identify individuals across cameras | 📐 |
| **Analysis** | [`home-security-benchmark`](skills/analysis/home-security-benchmark/) | [131-test evaluation suite](#-homesec-bench--how-secure-is-your-local-ai) for LLM & VLM security performance | ✅ |
@@ -48,13 +48,6 @@ Each skill is a self-contained module with its own model, parameters, and [commu

> **Registry:** All skills are indexed in [`skills.json`](skills.json) for programmatic discovery.

### 🗺️ Roadmap

- [x] **Skill architecture** — pluggable `SKILL.md` interface for all capabilities
- [x] **Full skill catalog** — 18 skills across 9 categories with working scripts
- [ ] **Skill Store UI** — browse, install, and configure skills from Aegis
- [ ] **Custom skill packaging** — community-contributed skills via GitHub
- [ ] **GPU-optimized containers** — one-click Docker deployment per skill

## 🚀 Getting Started with [SharpAI Aegis](https://www.sharpai.org)

@@ -89,6 +91,53 @@ The easiest way to run DeepCamera's AI skills. Aegis connects everything — cam
</table>


## 🎯 YOLO 2026 — Real-Time Object Detection

State-of-the-art detection running locally on **any hardware**, fully integrated as a [DeepCamera skill](skills/detection/yolo-detection-2026/).

### YOLO26 Models

YOLO26 (Jan 2026) eliminates NMS and DFL for cleaner exports and lower latency. Pick the size that fits your hardware:

| Model | Params | Latency (optimized) | Use Case |
|-------|--------|:-------------------:|----------|
| **yolo26n** (nano) | 2.6M | ~2ms | Edge devices, real-time on CPU |
| **yolo26s** (small) | 11.2M | ~5ms | Balanced speed & accuracy |
| **yolo26m** (medium) | 25.4M | ~12ms | Accuracy-focused |
| **yolo26l** (large) | 52.3M | ~25ms | Maximum detection quality |

All models detect **80+ COCO classes**: people, vehicles, animals, everyday objects.
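Picking a variant programmatically amounts to a latency-budget check against the table above. A sketch; the latency figures are the table's approximate numbers, not measurements on your hardware:

```python
def choose_model(latency_budget_ms: float) -> str:
    """Pick the largest YOLO26 variant whose approximate optimized
    latency (from the table above) fits the given budget."""
    variants = [  # (name, ~optimized latency in ms), largest first
        ("yolo26l", 25),
        ("yolo26m", 12),
        ("yolo26s", 5),
        ("yolo26n", 2),
    ]
    for name, latency_ms in variants:
        if latency_ms <= latency_budget_ms:
            return name
    return "yolo26n"  # nothing fits the budget; use the smallest

print(choose_model(10))  # → yolo26s
```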

### Hardware Acceleration

The shared [`env_config.py`](skills/lib/env_config.py) **auto-detects your GPU** and converts the model to the fastest native format — zero manual setup:

| Your Hardware | Optimized Format | Runtime | Speedup vs PyTorch |
|---------------|-----------------|---------|:------------------:|
| **NVIDIA GPU** (RTX, Jetson) | TensorRT `.engine` | CUDA | **3-5x** |
| **Apple Silicon** (M1–M4) | CoreML `.mlpackage` | ANE + GPU | **~2x** |
| **Intel** (CPU, iGPU, NPU) | OpenVINO IR `.xml` | OpenVINO | **2-3x** |
| **AMD GPU** (RX, MI) | ONNX Runtime | ROCm | **1.5-2x** |
| **Any CPU** | ONNX Runtime | CPU | **~1.5x** |
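The table above boils down to a backend-to-format lookup. The sketch below illustrates that mapping; backend names follow the skill's `requirements_{backend}.txt` convention, and this is not `env_config`'s actual API:

```python
def pick_export_format(backend: str) -> str:
    """Map a detected backend to its optimized export format, per the
    table above. Illustrative only; not env_config's real interface."""
    formats = {
        "cuda": "engine",     # NVIDIA: TensorRT .engine
        "mps": "coreml",      # Apple Silicon: CoreML .mlpackage
        "intel": "openvino",  # Intel: OpenVINO IR .xml
        "rocm": "onnx",       # AMD: ONNX Runtime on ROCm
        "cpu": "onnx",        # any CPU: ONNX Runtime
    }
    return formats.get(backend, "onnx")  # unknown hardware: portable ONNX

print(pick_export_format("mps"))  # → coreml
```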

### Aegis Skill Integration

Detection runs as a **parallel pipeline** alongside VLM analysis — never blocks your AI agent:

```
Camera → Frame Governor → detect.py (JSONL) → Aegis IPC → Live Overlay
             5 FPS              ↓
                          perf_stats (p50/p95/p99 latency)
```

- 🖱️ **Click to set up** — one button in Aegis installs everything, no terminal needed
- 🤖 **AI-driven environment config** — autonomous agent detects your GPU, installs the right framework (CUDA/ROCm/CoreML/OpenVINO), converts models, and verifies the setup
- 📺 **Live bounding boxes** — detection results rendered as overlays on RTSP camera streams
- 📊 **Built-in performance profiling** — aggregate latency stats (p50/p95/p99) emitted every 50 frames
- ⚡ **Auto start** — set `auto_start: true` to begin detecting when Aegis launches
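Consumers of the JSONL stream can dispatch on the `event` field. A minimal sketch; event shapes follow the protocol section of the skill's SKILL.md:

```python
import json

def summarize_event(line: str) -> str:
    """Summarize one JSONL line emitted by detect.py. Handles the
    documented "detections" and "perf_stats" events; anything else is
    passed through by name."""
    event = json.loads(line)
    kind = event.get("event")
    if kind == "detections":
        return f"frame {event['frame_id']}: {len(event['objects'])} object(s)"
    if kind == "perf_stats":
        p95 = event["timings_ms"]["inference"]["p95"]
        return f"inference p95: {p95} ms over {event['total_frames']} frames"
    return str(kind)

line = ('{"event": "detections", "frame_id": 42, "camera_id": "front_door",'
        ' "objects": [{"class": "person", "confidence": 0.92,'
        ' "bbox": [100, 50, 300, 400]}]}')
print(summarize_event(line))  # → frame 42: 1 object(s)
```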

📖 [Full Skill Documentation →](skills/detection/yolo-detection-2026/SKILL.md)

## 📊 HomeSec-Bench — How Secure Is Your Local AI?

**HomeSec-Bench** is a 131-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?
86 changes: 73 additions & 13 deletions skills/detection/yolo-detection-2026/SKILL.md
@@ -1,11 +1,19 @@
---
name: yolo-detection-2026
description: "YOLO 2026 — state-of-the-art real-time object detection"
version: 1.0.0
version: 2.0.0
icon: assets/icon.png
entry: scripts/detect.py
deploy: deploy.sh

parameters:
- name: auto_start
label: "Auto Start"
type: boolean
default: false
description: "Start this skill automatically when Aegis launches"
group: Lifecycle

- name: model_size
label: "Model Size"
type: select
@@ -45,6 +53,13 @@ parameters:
description: "auto = best available GPU, else CPU"
group: Performance

- name: use_optimized
label: "Hardware Acceleration"
type: boolean
default: true
description: "Auto-convert model to optimized format for faster inference"
group: Performance

capabilities:
live_detection:
script: scripts/detect.py
@@ -64,6 +79,50 @@ Real-time object detection using the latest YOLO 2026 models. Detects 80+ COCO o
| medium | Moderate | High | Accuracy-focused deployments |
| large | Slower | Highest | Maximum detection quality |

## Hardware Acceleration

The skill uses [`env_config.py`](../../lib/env_config.py) to **automatically detect hardware** and convert the model to the fastest format for your platform. Conversion happens once during deployment and is cached.

| Platform | Backend | Optimized Format | Expected Speedup |
|----------|---------|------------------|:----------------:|
| NVIDIA GPU | CUDA | TensorRT `.engine` | ~3-5x |
| Apple Silicon (M1+) | MPS | CoreML `.mlpackage` | ~2x |
| Intel CPU/GPU/NPU | OpenVINO | OpenVINO IR `.xml` | ~2-3x |
| AMD GPU | ROCm | ONNX Runtime | ~1.5-2x |
| CPU (any) | CPU | ONNX Runtime | ~1.5x |

### How It Works

1. `deploy.sh` detects your hardware via `env_config.HardwareEnv.detect()`
2. Installs the matching `requirements_{backend}.txt` (e.g. CUDA → includes `tensorrt`)
3. Pre-converts the default model to the optimal format
4. At runtime, `detect.py` loads the cached optimized model automatically
5. Falls back to PyTorch if optimization fails

Set `use_optimized: false` to disable auto-conversion and use raw PyTorch.
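The fallback in step 5 is ordinary exception handling around the load order described above. In this sketch, `loaders` is a hypothetical stand-in for the skill's real loading code, not a function that exists in this repository:

```python
def load_model(model_path, loaders, use_optimized=True):
    """Try the cached optimized artifact first, then fall back to raw
    PyTorch weights (steps 4-5 above). `loaders` maps loader names to
    callables and is a placeholder for the skill's real code."""
    if use_optimized:
        try:
            return loaders["optimized"](model_path)
        except Exception:
            pass  # optimization unavailable or broken: fall back
    return loaders["pytorch"](model_path)

def failing_loader(path):
    # Simulate a host where the optimized backend is missing.
    raise RuntimeError("TensorRT not installed")

loaders = {"optimized": failing_loader, "pytorch": lambda p: f"pytorch:{p}"}
print(load_model("yolo26n.pt", loaders))  # → pytorch:yolo26n.pt
```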

## Auto Start

Set `auto_start: true` in the skill config to start detection automatically when Aegis launches. The skill will begin processing frames from the selected camera immediately.

```yaml
auto_start: true
model_size: nano
fps: 5
```

## Performance Monitoring

The skill emits `perf_stats` events every 50 frames with aggregate timing:

```jsonl
{"event": "perf_stats", "total_frames": 50, "timings_ms": {
"inference": {"avg": 3.4, "p50": 3.2, "p95": 5.1},
"postprocess": {"avg": 0.15, "p50": 0.12, "p95": 0.31},
"total": {"avg": 3.6, "p50": 3.4, "p95": 5.5}
}}
```
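One plausible way to produce those aggregates is a nearest-rank percentile over a window of per-frame timings. The skill's exact method is not documented, so treat this as a sketch:

```python
def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile over a window of per-frame timings."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# 10 inference timings in ms, with one slow outlier:
window = [3.2, 3.1, 3.4, 3.3, 9.8, 3.2, 3.5, 3.0, 3.3, 3.6]
print(percentile(window, 50), percentile(window, 95))
```

The p95 figure is what surfaces outliers like the 9.8 ms frame above, which an average alone would hide.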

## Protocol

Communicates via **JSON lines** over stdin/stdout.
@@ -75,10 +134,11 @@ Communicates via **JSON lines** over stdin/stdout.

### Skill → Aegis (stdout)
```jsonl
{"event": "ready", "model": "yolo2026n", "device": "mps", "backend": "mps", "format": "coreml", "gpu": "Apple M3", "classes": 80, "fps": 5}
{"event": "detections", "frame_id": 42, "camera_id": "front_door", "timestamp": "...", "objects": [
{"class": "person", "confidence": 0.92, "bbox": [100, 50, 300, 400]}
]}
{"event": "perf_stats", "total_frames": 50, "timings_ms": {"inference": {"avg": 3.4}}}
{"event": "error", "message": "...", "retriable": true}
```

@@ -90,20 +150,20 @@ Communicates via **JSON lines** over stdin/stdout.
{"command": "stop"}
```
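From the Aegis side, commands are just newline-delimited JSON written to the skill's stdin. A sketch; the `configure` parameter names shown are assumptions based on the skill's parameter list, not a verbatim transcript:

```python
import json

def command(name: str, **params) -> str:
    """Serialize one Aegis-to-skill command as a JSON line."""
    return json.dumps({"command": name, **params})

# A typical session written to detect.py's stdin (model_size / fps are
# assumed parameter names, taken from the skill's config):
session = [
    command("configure", model_size="nano", fps=5),
    command("start"),
    command("stop"),
]
print("\n".join(session))
```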

## Hardware Support

| Platform | Backend | Performance |
|----------|---------|-------------|
| Apple Silicon (M1+) | MPS | 20-30 FPS |
| NVIDIA GPU | CUDA | 25-60 FPS |
| AMD GPU | ROCm | 15-40 FPS |
| CPU (modern x86) | CPU | 5-15 FPS |
| Raspberry Pi 5 | CPU | 2-5 FPS |

## Installation

The `deploy.sh` bootstrapper handles everything — Python environment, GPU backend detection, dependency installation, and model optimization. No manual setup required.

```bash
./deploy.sh
```

### Requirements Files

| File | Backend | Key Deps |
|------|---------|----------|
| `requirements_cuda.txt` | NVIDIA | `torch` (cu124), `tensorrt` |
| `requirements_mps.txt` | Apple | `torch`, `coremltools` |
| `requirements_intel.txt` | Intel | `torch`, `openvino` |
| `requirements_rocm.txt` | AMD | `torch` (rocm6.2), `onnxruntime-rocm` |
| `requirements_cpu.txt` | CPU | `torch` (cpu), `onnxruntime` |
8 changes: 8 additions & 0 deletions skills/detection/yolo-detection-2026/config.yaml
@@ -56,3 +56,11 @@ params:
- { value: cuda, label: "NVIDIA CUDA" }
- { value: mps, label: "Apple Silicon (MPS)" }
- { value: rocm, label: "AMD ROCm" }

- key: use_optimized
label: Hardware Acceleration
type: boolean
default: true
description: "Auto-convert model to optimized format (TensorRT/CoreML/OpenVINO/ONNX) for faster inference"

