The MQTT Camera AI Monitor is a TypeScript/Node.js application that monitors MQTT channels for trigger commands, captures high-quality images from RTSP cameras using FFmpeg, and processes them through OpenAI-compatible AI endpoints for analysis. The application supports both single and multi-image capture for advanced motion detection and temporal analysis.
- MQTT Integration: Monitors MQTT channels for trigger commands with automatic reconnection
- RTSP Camera Support: Captures high-quality images from RTSP camera streams using FFmpeg
- Multi-Image Capture: Sequential image capture with configurable intervals for motion analysis
- AI Processing: Sends captured images to OpenAI-compatible endpoints for analysis
- Binary Image Publishing: Publishes captured images as binary data to MQTT topics
- Real-time Status Tracking: Live status updates during processing (e.g., "Taking snapshot 2/4")
- Comprehensive Statistics: Detailed performance metrics and error tracking per camera
- Status Monitoring: Publishes online/offline status via Last Will and Testament
- Automatic Cleanup: Removes temporary image files to prevent disk space issues
- Graceful Shutdown: Properly handles shutdown signals and cleanup
- Docker Support: Runs in containerized environments with configurable file paths
- Comprehensive Logging: Detailed logging with configurable levels for monitoring and debugging
- Structured Output: Optional JSON schema-based responses for consistent data format
The application uses the following topic structure: <basetopic>/<cameraname>/<topicname>
- <basetopic>/online - Application status ("YES" when running, "NO" when offline)
- <basetopic>/<camera>/trigger - Trigger topic (set to "YES" to start analysis, automatically resets to "NO")
- <basetopic>/<camera>/image - Binary image data (JPEG format, retained, first captured image only)
- <basetopic>/<camera>/ai - AI analysis response (text or JSON, retained)
- <basetopic>/<camera>/status - Current processing status (text, retained)
- <basetopic>/<camera>/stats - Performance statistics and error tracking (JSON, retained)
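The topic layout above can be expressed as a few small helpers. This is only a sketch for client code that talks to the monitor; the function and type names are illustrative and are not taken from the application's source.

```typescript
// Per-camera topic names as listed above. The type name is illustrative.
type CameraTopic = "trigger" | "image" | "ai" | "status" | "stats";

// Builds <basetopic>/<camera>/<topicname>.
function cameraTopic(baseTopic: string, camera: string, topic: CameraTopic): string {
  return `${baseTopic}/${camera}/${topic}`;
}

// Builds <basetopic>/online, the application-wide status topic.
function onlineTopic(baseTopic: string): string {
  return `${baseTopic}/online`;
}

// Extracts the camera name from a trigger topic, or returns null
// when the topic is not of the form <basetopic>/<camera>/trigger.
function cameraFromTriggerTopic(baseTopic: string, topic: string): string | null {
  const prefix = `${baseTopic}/`;
  const suffix = "/trigger";
  if (!topic.startsWith(prefix) || !topic.endsWith(suffix)) return null;
  const camera = topic.slice(prefix.length, topic.length - suffix.length);
  return camera.length > 0 && !camera.includes("/") ? camera : null;
}
```

For example, `cameraTopic("mqttcaim", "garage", "trigger")` yields the topic used in the usage examples later in this README.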
The /status topic provides real-time updates during processing:
- "Idle" - Camera ready for triggers
- "Starting image capture" - Beginning capture process
- "Taking snapshot" - Single image capture in progress
- "Taking snapshot X/Y" - Multi-image capture progress (e.g., "Taking snapshot 2/4")
- "Waiting for next capture (X/Y)" - Interval delay between captures
- "Publishing image" - Uploading image to MQTT
- "Processing with AI" - Sending to AI service for analysis
- "Publishing AI response" - Uploading AI results to MQTT
- "Cleaning up" - Removing temporary files
- "Complete" - Successfully finished processing
- "Error" - Error occurred during processing
- "Offline" - Service shutting down
The /stats topic publishes a JSON object with performance metrics:
{
  "lastErrorDate": "2023-01-01T12:00:00.000Z",
  "lastErrorType": "Connection timeout",
  "lastSuccessDate": "2023-01-01T12:05:00.000Z",
  "lastAiProcessTime": 2.5,
  "lastTotalProcessTime": 8.2
}

Statistics Properties:
- lastErrorDate - ISO timestamp of most recent error
- lastErrorType - Description of the last error that occurred
- lastSuccessDate - ISO timestamp of most recent successful processing
- lastAiProcessTime - Time in seconds for AI processing only
- lastTotalProcessTime - Total time in seconds from trigger to completion
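A consumer of the /stats topic might model the payload as follows. This is a sketch: the field names come from the payload above, while the interface name and the two update helpers are hypothetical. Every field is marked optional on the assumption that a camera may not yet have recorded an error or a success.

```typescript
// Field names follow the /stats payload; all are optional because a camera
// may not have recorded an error (or a success) yet. This is an assumption.
interface CameraStats {
  lastErrorDate?: string;        // ISO timestamp
  lastErrorType?: string;
  lastSuccessDate?: string;      // ISO timestamp
  lastAiProcessTime?: number;    // seconds
  lastTotalProcessTime?: number; // seconds
}

// Hypothetical helper: returns a copy of the stats after a successful run.
function recordSuccess(
  stats: CameraStats,
  finishedAt: Date,
  aiSeconds: number,
  totalSeconds: number,
): CameraStats {
  return {
    ...stats,
    lastSuccessDate: finishedAt.toISOString(),
    lastAiProcessTime: aiSeconds,
    lastTotalProcessTime: totalSeconds,
  };
}

// Hypothetical helper: returns a copy of the stats after a failed run,
// leaving the last-success fields untouched.
function recordError(stats: CameraStats, failedAt: Date, errorType: string): CameraStats {
  return { ...stats, lastErrorDate: failedAt.toISOString(), lastErrorType: errorType };
}
```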
- Initialization: Application connects to MQTT broker and initializes camera topics
- Trigger: Set <basetopic>/<camera>/trigger to "YES" to start analysis
- Status Updates: Real-time status published to <basetopic>/<camera>/status
- Image Capture: Application captures one or multiple high-quality images from RTSP camera
- Image Publishing: Binary data of the first captured image is published to <basetopic>/<camera>/image
- AI Processing: All captured images are sent to AI endpoint with configured prompt
- Result Publishing: AI response is published to <basetopic>/<camera>/ai
- Statistics Update: Performance metrics published to <basetopic>/<camera>/stats
- Reset: Trigger topic is automatically reset to "NO"
- Cleanup: All temporary image files are automatically deleted
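To make the workflow concrete, the sketch below lists the status strings a single successful run might publish, in order. The strings are the ones documented for the /status topic; the exact ordering, and the reading of X in "Waiting for next capture (X/Y)" as the number of captures already taken, are assumptions rather than confirmed behaviour. The function name is hypothetical.

```typescript
// Returns the /status values one successful run might publish, in order.
// Ordering and the meaning of X in the "Waiting" status are assumptions.
function planStatuses(captures: number): string[] {
  const statuses: string[] = ["Starting image capture"];
  if (captures <= 1) {
    // Single-image mode uses the plain status without a counter.
    statuses.push("Taking snapshot");
  } else {
    for (let i = 1; i <= captures; i++) {
      statuses.push(`Taking snapshot ${i}/${captures}`);
      // There is no wait after the final capture.
      if (i < captures) statuses.push(`Waiting for next capture (${i}/${captures})`);
    }
  }
  statuses.push(
    "Publishing image",
    "Processing with AI",
    "Publishing AI response",
    "Cleaning up",
    "Complete",
  );
  return statuses;
}
```

A subscriber could compare the statuses it receives against such a plan to notice a run that stalls partway through.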
The application supports capturing multiple sequential images to provide AI with temporal context for motion detection and analysis:
- Single Image Mode (default): Traditional single snapshot capture
- Multi-Image Mode: Capture 2-10+ sequential images with configurable intervals
- AI Context: Multiple images are sent in chronological order with enhanced prompts
- Motion Analysis: AI can detect movement, direction, and changes across the sequence
- Progress Tracking: Status updates show capture progress (e.g., "Taking snapshot 3/5")
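With an OpenAI-compatible Chat Completions endpoint, several images are usually sent as `image_url` content parts within one user message. The sketch below builds such a message with the images in chronological order. Whether the application constructs exactly this payload is an assumption, and the prompt enhancement mentioned above is not reproduced here. The type and function names are illustrative.

```typescript
// Content parts as used by OpenAI-compatible chat completion requests.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionMessage {
  role: "user";
  content: ContentPart[];
}

// Builds one user message: the prompt first, then each JPEG (base64-encoded)
// as a data URL, in the order given (oldest capture first).
function buildVisionMessage(prompt: string, jpegBase64Images: string[]): VisionMessage {
  const images = jpegBase64Images.map(
    (b64): ContentPart => ({
      type: "image_url",
      image_url: { url: `data:image/jpeg;base64,${b64}` },
    }),
  );
  return { role: "user", content: [{ type: "text", text: prompt }, ...images] };
}
```

The resulting object would go into the `messages` array of the request body, alongside the configured model name.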
- Node.js 18+ (for bare metal installation)
- FFmpeg (for camera image capture)
- MQTT broker
- OpenAI-compatible API endpoint with vision support
Create a config.yaml file based on the provided config.yaml.sample.
- endpoint (required): RTSP stream URL with authentication
- prompt (required): Text prompt for AI analysis
- captures (optional): Number of images to capture (default: 1, range: 1-10+)
- interval (optional): Milliseconds between captures (default: 1000, minimum: 0)
- output (optional): Structured output schema for consistent JSON responses
simple_camera:
  endpoint: rtsp://user:pass@camera/stream
  prompt: "What do you see?"

motion_camera:
  endpoint: rtsp://user:pass@camera/stream
  captures: 4
  interval: 2000
  prompt: "Detect and analyze movement across these 4 sequential images."

security_camera:
  endpoint: rtsp://user:pass@camera/stream
  captures: 3
  interval: 5000
  prompt: "Analyze security footage for people and vehicle activity."
  output:
    PeopleDetected:
      type: string
      enum: ["Yes", "No", "Unknown"]
    VehicleMovement:
      type: string
      enum: ["Entering", "Leaving", "None", "Unknown"]
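The defaults and limits described for camera entries can be applied with a small normalization step once the YAML has been parsed. This is a sketch under the documented rules (endpoint and prompt required, captures defaulting to 1, interval defaulting to 1000 ms and never negative). The interface and function names are hypothetical and do not come from the application's code.

```typescript
// Shape of one camera entry as parsed from config.yaml (names illustrative).
interface CameraConfigInput {
  endpoint?: string;
  prompt?: string;
  captures?: number;
  interval?: number;
  output?: Record<string, unknown>;
}

interface CameraConfig {
  endpoint: string;
  prompt: string;
  captures: number;
  interval: number;
  output?: Record<string, unknown>;
}

// Applies the documented defaults and rejects values outside the stated limits.
function normalizeCameraConfig(name: string, raw: CameraConfigInput): CameraConfig {
  if (!raw.endpoint) throw new Error(`${name}: endpoint is required`);
  if (!raw.prompt) throw new Error(`${name}: prompt is required`);
  const captures = raw.captures ?? 1;    // default: single image
  const interval = raw.interval ?? 1000; // default: 1000 ms
  if (!Number.isInteger(captures) || captures < 1) {
    throw new Error(`${name}: captures must be an integer >= 1`);
  }
  if (!(interval >= 0)) {
    throw new Error(`${name}: interval must be >= 0 milliseconds`);
  }
  return {
    endpoint: raw.endpoint,
    prompt: raw.prompt,
    captures,
    interval,
    ...(raw.output ? { output: raw.output } : {}),
  };
}
```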
- Install dependencies:
  npm install
- Place your config file:
  cp config.yaml.sample config.yaml
  # Edit config.yaml with your settings
- Run the application:
  npm start
  # With debug logging
  LOG_LEVEL=debug npm start
docker run -d \
  --name mqtt-camera-monitor \
  -v /path/to/your/config.yaml:/usr/src/app/config.yaml \
  kosdk/mqtt-camera-ai-monitor:latest

docker run -d \
  --name mqtt-camera-monitor \
  -e CONFIG_FILE=/app/config/custom-config.yaml \
  -e LOG_LEVEL=debug \
  -v /path/to/your/config.yaml:/app/config/custom-config.yaml \
  kosdk/mqtt-camera-ai-monitor:latest

version: '3.8'
services:
  mqtt-camera-monitor:
    image: kosdk/mqtt-camera-ai-monitor:latest
    container_name: mqtt-camera-monitor
    environment:
      - CONFIG_FILE=/app/config/config.yaml
      - LOG_LEVEL=info
    volumes:
      - ./config.yaml:/app/config/config.yaml
    restart: unless-stopped

# Trigger analysis
mosquitto_pub -h mqtt-server -t "mqttcaim/garage/trigger" -m "YES"
# Monitor results and status
mosquitto_sub -h mqtt-server -t "mqttcaim/garage/#"

# Trigger 5-image sequence analysis
mosquitto_pub -h mqtt-server -t "mqttcaim/driveway/trigger" -m "YES"
# The system will:
# 1. Capture 5 images at 3s intervals (4 intervals, roughly 12 seconds plus capture time)
# 2. Send all images to AI for motion analysis
# 3. Publish results to mqttcaim/driveway/ai

# Watch real-time status updates
mosquitto_sub -h mqtt-server -t "mqttcaim/driveway/status"
# Monitor performance statistics
mosquitto_sub -h mqtt-server -t "mqttcaim/driveway/stats"
# Check if application is online
mosquitto_sub -h mqtt-server -t "mqttcaim/online"
# Monitor all activity for a camera
mosquitto_sub -h mqtt-server -t "mqttcaim/driveway/#"
# Monitor everything
mosquitto_sub -h mqtt-server -t "mqttcaim/#"

- Direction Analysis: Detect people/vehicles entering or leaving
- Speed Estimation: Understand movement speed across frames
- Path Tracking: Follow object movement through the scene
- Activity Patterns: Understand what happened over time
- Change Detection: Identify what changed between frames
- Event Sequencing: Understand the order of events
- Security Monitoring: Detect intrusions with movement context
- Traffic Analysis: Monitor vehicle flow and parking changes
- Wildlife Observation: Track animal behavior over time
- Package Delivery: Detect delivery events with full context
LOG_LEVEL: Controls logging verbosity
- error: Only errors
- warn: Warnings and errors
- info: Info, warnings, and errors (default)
- debug: Debug, info, warnings, and errors
- verbose: Very detailed logging
- silly: Everything
CONFIG_FILE: Custom config file path (Docker only)
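A sketch of how LOG_LEVEL could be resolved against the levels listed above, falling back to the documented default of info. Trimming and lower-casing the value is an assumption about how lenient the application is, and the function name is hypothetical. The environment is passed in as a plain object so the helper has no dependency on Node.js globals.

```typescript
// Levels as listed in this README.
const LOG_LEVELS = ["error", "warn", "info", "debug", "verbose", "silly"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Returns the requested level when it is one of the known names,
// otherwise the documented default ("info").
function resolveLogLevel(env: Record<string, string | undefined>): LogLevel {
  const requested = (env["LOG_LEVEL"] ?? "").trim().toLowerCase();
  const match = LOG_LEVELS.find((level) => level === requested);
  return match ?? "info";
}
```

In a Node.js process this would typically be called as `resolveLogLevel(process.env)`.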
- Console output: Real-time application logs
- error.log: Error-level logs only
- combined.log: All log levels
Use the /stats topic to monitor camera performance:
# Get current stats for a camera
mosquitto_sub -h mqtt-server -t "mqttcaim/camera1/stats" -C 1
# Example output:
{
  "lastSuccessDate": "2023-10-15T14:30:25.123Z",
  "lastAiProcessTime": 3.2,
  "lastTotalProcessTime": 12.8
}

- MQTT Connection: Check server address, credentials, and network connectivity
- Camera Access: Verify RTSP URLs and camera credentials
- AI API: Confirm the endpoint URL and API token, and that the model supports vision
- Multi-Image Timeouts: Increase AI timeout for multiple image processing
- Disk Space: Ensure adequate space for temporary image files
- Memory Usage: Multiple large images may require more RAM
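When checking camera access, it can help to grab a single frame with FFmpeg outside the application. The sketch below builds an argument list for that purpose using standard FFmpeg options. The application's own FFmpeg invocation is not documented here, so these arguments are an assumption and not a copy of what it runs. The function name is hypothetical.

```typescript
// Builds FFmpeg arguments that save one JPEG frame from an RTSP stream.
// These are standard FFmpeg options, not necessarily the ones the app uses.
function buildSnapshotArgs(rtspUrl: string, outputPath: string): string[] {
  return [
    "-rtsp_transport", "tcp", // input option: carry RTSP over TCP
    "-i", rtspUrl,            // input stream
    "-frames:v", "1",         // stop after one video frame
    "-q:v", "2",              // JPEG quality scale (lower means higher quality)
    "-y",                     // overwrite the output file if it exists
    outputPath,
  ];
}
```

The array can be passed to `child_process.spawn("ffmpeg", args)` or joined into a shell command for a manual test.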
- Capture Duration: Total time = (captures - 1) × interval
- AI Processing Time: Increases with number of images
- Network Bandwidth: Multiple images require more upload bandwidth
- Storage: Temporary files are automatically cleaned up
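The capture duration formula above works out as follows. For the motion_camera example (captures: 4, interval: 2000) the waiting time is 3 × 2000 ms = 6 seconds; the time each individual capture takes is not part of the formula and would add to it. The function names are illustrative.

```typescript
// Total waiting time between the first and last capture:
// (captures - 1) × interval, in milliseconds.
function captureWaitMs(captures: number, intervalMs: number): number {
  return Math.max(0, captures - 1) * intervalMs;
}

// Offsets (in ms from the first capture) at which each capture is scheduled,
// ignoring the time the captures themselves take.
function captureOffsetsMs(captures: number, intervalMs: number): number[] {
  const offsets: number[] = [];
  for (let i = 0; i < captures; i++) offsets.push(i * intervalMs);
  return offsets;
}
```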
MQTT broker settings:
- server: MQTT broker hostname/IP
- port: MQTT broker port (typically 1883)
- basetopic: Base topic for all MQTT communications
- user/password: MQTT authentication credentials
- client: MQTT client identifier

AI endpoint settings:
- endpoint: AI API endpoint URL
- api_token: API authentication token
- model: AI model name (must support vision/image input)

Per-camera settings:
- endpoint: RTSP stream URL with credentials
- prompt: Text prompt for AI analysis
- captures: Number of sequential images (1-10+, default: 1)
- interval: Milliseconds between captures (≥0, default: 1000)
- output: Optional structured output schema
Refer to Structured model outputs - OpenAI API for more information about the output schema.
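For orientation, OpenAI's structured outputs expect a `response_format` of type `json_schema` whose schema, in strict mode, lists every property as required and sets `additionalProperties` to false. The sketch below wraps a camera's `output` map in that envelope. How the application actually maps `output` to the request is an assumption, and the type and function names are hypothetical.

```typescript
// One entry of a camera's `output` map, e.g. { type: "string", enum: [...] }.
interface OutputProperty {
  type: string;
  enum?: string[];
  description?: string;
}

// Wraps the `output` map in an OpenAI-style response_format object.
// Strict mode requires all properties to be listed in `required`
// and additionalProperties to be false.
function buildResponseFormat(name: string, output: Record<string, OutputProperty>) {
  return {
    type: "json_schema" as const,
    json_schema: {
      name,
      strict: true,
      schema: {
        type: "object" as const,
        properties: output,
        required: Object.keys(output),
        additionalProperties: false,
      },
    },
  };
}
```

Applied to the security_camera example, the `required` list would contain PeopleDetected and VehicleMovement.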
- Optimal Captures: 3-5 images for most motion detection scenarios
- Interval Timing: 1-5 seconds depending on expected motion speed
- Total Duration: Keep under 30 seconds to avoid timeout issues
- Prompt Design: Include context about sequential analysis in prompts
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes with appropriate tests
- Submit a pull request
This project is licensed under the BSD 3-Clause License. See the LICENSE file for details.
For issues and questions:
- Check the application logs for error details
- Monitor camera /status and /stats topics for real-time information
- Verify configuration settings
- Ensure network connectivity to MQTT broker and AI endpoint
- Test with single image before using multi-image capture
- Monitor disk space and memory usage for multi-image scenarios
- Open an issue on the project repository
Vibe-coded in Claude Sonnet 4. No cap.