MQTT Camera AI Monitor

Overview

The MQTT Camera AI Monitor is a TypeScript/Node.js application that monitors MQTT channels for trigger commands, captures high-quality images from RTSP cameras using FFmpeg, and processes them through OpenAI-compatible AI endpoints for analysis. The application supports both single and multi-image capture for advanced motion detection and temporal analysis.

Features

MQTT Integration: Monitors MQTT channels for trigger commands with automatic reconnection
RTSP Camera Support: Captures high-quality images from RTSP camera streams using FFmpeg
Multi-Image Capture: Sequential image capture with configurable intervals for motion analysis
AI Processing: Sends captured images to OpenAI-compatible endpoints for analysis
Binary Image Publishing: Publishes captured images as binary data to MQTT topics
Real-time Status Tracking: Live status updates during processing (e.g., "Taking snapshot 2/4")
Comprehensive Statistics: Detailed performance metrics and error tracking per camera
Status Monitoring: Publishes online/offline status via Last Will and Testament
Automatic Cleanup: Removes temporary image files to prevent disk space issues
Graceful Shutdown: Properly handles shutdown signals and cleanup
Docker Support: Runs in containerized environments with configurable file paths
Comprehensive Logging: Detailed logging with configurable levels for monitoring and debugging
Structured Output: Optional JSON schema-based responses for consistent data format

How it Works

MQTT Topic Structure

The application uses the following topic structure: <basetopic>/<cameraname>/<topicname>

Status Topics

<basetopic>/online - Application status ("YES" when running, "NO" when offline)

Camera Topics (for each configured camera)

<basetopic>/<camera>/trigger - Trigger topic (set to "YES" to start analysis, automatically resets to "NO")
<basetopic>/<camera>/image - Binary image data (JPEG format, retained) - First captured image
<basetopic>/<camera>/ai - AI analysis response (text or JSON, retained)
<basetopic>/<camera>/status - Current processing status (text, retained)
<basetopic>/<camera>/stats - Performance statistics and error tracking (JSON, retained)

Camera Status Values

The /status topic provides real-time updates during processing:

"Idle" - Camera ready for triggers
"Starting image capture" - Beginning capture process
"Taking snapshot" - Single image capture in progress
"Taking snapshot X/Y" - Multi-image capture progress (e.g., "Taking snapshot 2/4")
"Waiting for next capture (X/Y)" - Interval delay between captures
"Publishing image" - Uploading image to MQTT
"Processing with AI" - Sending to AI service for analysis
"Publishing AI response" - Uploading AI results to MQTT
"Cleaning up" - Removing temporary files
"Complete" - Successfully finished processing
"Error" - Error occurred during processing
"Offline" - Service shutting down

Camera Statistics

The /stats topic publishes a JSON object with performance metrics:

{
    "lastErrorDate": "2023-01-01T12:00:00.000Z",
    "lastErrorType": "Connection timeout",
    "lastSuccessDate": "2023-01-01T12:05:00.000Z",
    "lastAiProcessTime": 2.5,
    "lastTotalProcessTime": 8.2
}

Statistics Properties:

lastErrorDate - ISO timestamp of most recent error
lastErrorType - Description of the last error that occurred
lastSuccessDate - ISO timestamp of most recent successful processing
lastAiProcessTime - Time in seconds for AI processing only
lastTotalProcessTime - Total time in seconds from trigger to completion

Workflow

Initialization: Application connects to MQTT broker and initializes camera topics
Trigger: Set <basetopic>/<camera>/trigger to "YES" to start analysis
Status Updates: Real-time status published to <basetopic>/<camera>/status
Image Capture: Application captures one or multiple high-quality images from RTSP camera
Image Publishing: Binary data of the first captured image is published to <basetopic>/<camera>/image
AI Processing: All captured images are sent to AI endpoint with configured prompt
Result Publishing: AI response is published to <basetopic>/<camera>/ai
Statistics Update: Performance metrics published to <basetopic>/<camera>/stats
Reset: Trigger topic is automatically reset to "NO"
Cleanup: All temporary image files are automatically deleted

Multi-Image Capture

The application supports capturing multiple sequential images to provide AI with temporal context for motion detection and analysis:

Single Image Mode (default): Traditional single snapshot capture
Multi-Image Mode: Capture 2-10+ sequential images with configurable intervals
AI Context: Multiple images are sent in chronological order with enhanced prompts
Motion Analysis: AI can detect movement, direction, and changes across the sequence
Progress Tracking: Status updates show capture progress (e.g., "Taking snapshot 3/5")

Installation & Setup

Prerequisites

Node.js 18+ (for bare metal installation)
FFmpeg (for camera image capture)
MQTT broker
OpenAI-compatible API endpoint with vision support

Camera Configuration Options

Create a config.yaml file based on the provided config.yaml.sample.

Basic Settings

endpoint (required): RTSP stream URL with authentication
prompt (required): Text prompt for AI analysis

Multi-Image Settings

captures (optional): Number of images to capture (default: 1, range: 1-10+)
interval (optional): Milliseconds between captures (default: 1000, minimum: 0)

Advanced Settings

output (optional): Structured output schema for consistent JSON responses

Configuration Examples

Minimal Single Image

simple_camera:
  endpoint: rtsp://user:pass@camera/stream
  prompt: "What do you see?"

Motion Detection with Multiple Images

motion_camera:
  endpoint: rtsp://user:pass@camera/stream
  captures: 4
  interval: 2000
  prompt: "Detect and analyze movement across these 4 sequential images."

Structured Output with Multiple Images

security_camera:
  endpoint: rtsp://user:pass@camera/stream
  captures: 3
  interval: 5000
  prompt: "Analyze security footage for people and vehicle activity."
  output:
    PeopleDetected:
      type: string
      enum: ["Yes", "No", "Unknown"]
    VehicleMovement:
      type: string
      enum: ["Entering", "Leaving", "None", "Unknown"]

Deployment Options

Bare Metal

Install dependencies:
```
npm install
```

Place your config file:

cp config.yaml.sample config.yaml
# Edit config.yaml with your settings

Run the application:

npm start

# With debug logging
LOG_LEVEL=debug npm start

Docker

Using Pre-built Image

docker run -d \
  --name mqtt-camera-monitor \
  -v /path/to/your/config.yaml:/usr/src/app/config.yaml \
  kosdk/mqtt-camera-ai-monitor:latest

Using Custom Config Location (Docker)

docker run -d \
  --name mqtt-camera-monitor \
  -e CONFIG_FILE=/app/config/custom-config.yaml \
  -e LOG_LEVEL=debug \
  -v /path/to/your/config.yaml:/app/config/custom-config.yaml \
  kosdk/mqtt-camera-ai-monitor:latest

Docker Compose Example

version: '3.8'
services:
  mqtt-camera-monitor:
    image: kosdk/mqtt-camera-ai-monitor:latest
    container_name: mqtt-camera-monitor
    environment:
      - CONFIG_FILE=/app/config/config.yaml
      - LOG_LEVEL=info
    volumes:
      - ./config.yaml:/app/config/config.yaml
    restart: unless-stopped

Usage Examples

Basic Single Image Analysis

# Trigger analysis
mosquitto_pub -h mqtt-server -t "mqttcaim/garage/trigger" -m "YES"

# Monitor results and status
mosquitto_sub -h mqtt-server -t "mqttcaim/garage/#"

Multi-Image Motion Detection

# Trigger 5-image sequence analysis
mosquitto_pub -h mqtt-server -t "mqttcaim/driveway/trigger" -m "YES"

# The system will:
# 1. Capture 5 images over 15 seconds (3s intervals)
# 2. Send all images to AI for motion analysis
# 3. Publish results to mqttcaim/driveway/ai

Monitoring Camera Status and Statistics

# Watch real-time status updates
mosquitto_sub -h mqtt-server -t "mqttcaim/driveway/status"

# Monitor performance statistics
mosquitto_sub -h mqtt-server -t "mqttcaim/driveway/stats"

# Check if application is online
mosquitto_sub -h mqtt-server -t "mqttcaim/online"

# Monitor all activity for a camera
mosquitto_sub -h mqtt-server -t "mqttcaim/driveway/#"

# Monitor everything
mosquitto_sub -h mqtt-server -t "mqttcaim/#"

Multi-Image Capture Benefits

Motion Detection

Direction Analysis: Detect people/vehicles entering or leaving
Speed Estimation: Understand movement speed across frames
Path Tracking: Follow object movement through the scene

Context Understanding

Activity Patterns: Understand what happened over time
Change Detection: Identify what changed between frames
Event Sequencing: Understand the order of events

Use Cases

Security Monitoring: Detect intrusions with movement context
Traffic Analysis: Monitor vehicle flow and parking changes
Wildlife Observation: Track animal behavior over time
Package Delivery: Detect delivery events with full context

Monitoring & Troubleshooting

Environment Variables

LOG_LEVEL: Controls logging verbosity
- error: Only errors
- warn: Warnings and errors
- info: Info, warnings, and errors (default)
- debug: Debug, info, warnings, and errors
- verbose: Very detailed logging
- silly: Everything
CONFIG_FILE: Custom config file path (Docker only)

Log Files

Console output: Real-time application logs
error.log: Error-level logs only
combined.log: All log levels

Performance Monitoring

Use the /stats topic to monitor camera performance:

# Get current stats for a camera
mosquitto_sub -h mqtt-server -t "mqttcaim/camera1/stats" -C 1

# Example output:
{
  "lastSuccessDate": "2023-10-15T14:30:25.123Z",
  "lastAiProcessTime": 3.2,
  "lastTotalProcessTime": 12.8
}

Common Issues

MQTT Connection: Check server address, credentials, and network connectivity
Camera Access: Verify RTSP URLs and camera credentials
AI API: Confirm endpoint URL, API token, and model supports vision
Multi-Image Timeouts: Increase AI timeout for multiple image processing
Disk Space: Ensure adequate space for temporary image files
Memory Usage: Multiple large images may require more RAM

Performance Considerations

Capture Duration: Total time = (captures - 1) × interval
AI Processing Time: Increases with number of images
Network Bandwidth: Multiple images require more upload bandwidth
Storage: Temporary files are automatically cleaned up

Configuration Reference

MQTT Settings

server: MQTT broker hostname/IP
port: MQTT broker port (typically 1883)
basetopic: Base topic for all MQTT communications
user/password: MQTT authentication credentials
client: MQTT client identifier

OpenAI Settings

endpoint: AI API endpoint URL
api_token: API authentication token
model: AI model name (must support vision/image input)

Camera Settings

endpoint: RTSP stream URL with credentials
prompt: Text prompt for AI analysis
captures: Number of sequential images (1-10+, default: 1)
interval: Milliseconds between captures (≥0, default: 1000)
output: Optional structured output schema

Refer to Structured model outputs - OpenAI API for more information about the output schema.

Multi-Image Guidelines

Optimal Captures: 3-5 images for most motion detection scenarios
Interval Timing: 1-5 seconds depending on expected motion speed
Total Duration: Keep under 30 seconds to avoid timeout issues
Prompt Design: Include context about sequential analysis in prompts

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes with appropriate tests
Submit a pull request

License

This project is licensed under the BSD 3-Clause License. See the LICENSE file for details.

Support

For issues and questions:

Check the application logs for error details
Monitor camera /status and /stats topics for real-time information
Verify configuration settings
Ensure network connectivity to MQTT broker and AI endpoint
Test with single image before using multi-image capture
Monitor disk space and memory usage for multi-image scenarios
Open an issue on the project repository

AI Notice

Vibe-coded in Claude Sonnet 4. No cap.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
.github/workflows		.github/workflows
src		src
.dockerignore		.dockerignore
.editorconfig		.editorconfig
.gitignore		.gitignore
.prettierrc.json		.prettierrc.json
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
config.yaml.sample		config.yaml.sample
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

License

Cossey/mqtt-camera-ai-monitor

Folders and files

Latest commit

History

Repository files navigation

MQTT Camera AI Monitor

Overview

Features

How it Works

MQTT Topic Structure

Status Topics

Camera Topics (for each configured camera)

Camera Status Values

Camera Statistics

Workflow

Multi-Image Capture

Installation & Setup

Prerequisites

Camera Configuration Options

Basic Settings

Multi-Image Settings

Advanced Settings

Configuration Examples

Minimal Single Image

Motion Detection with Multiple Images

Structured Output with Multiple Images

Deployment Options

Bare Metal

Docker

Using Pre-built Image

Using Custom Config Location (Docker)

Docker Compose Example

Usage Examples

Basic Single Image Analysis

Multi-Image Motion Detection

Monitoring Camera Status and Statistics

Multi-Image Capture Benefits

Motion Detection

Context Understanding

Use Cases

Monitoring & Troubleshooting

Environment Variables

Log Files

Performance Monitoring

Common Issues

Performance Considerations

Configuration Reference

MQTT Settings

OpenAI Settings

Camera Settings

Multi-Image Guidelines

Contributing

License

Support

AI Notice

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Contributors 2

Languages