Reality Layer

AR-powered object detection and contextual understanding for physical systems.

Reality Layer transforms your iPhone camera into an intelligent analyzer. Point at a breaker panel, HVAC system, or other equipment and get real-time AR overlays identifying components, safety warnings, and diagnostic guidance—powered by Google's Gemini Pro Vision.

Quick Start (< 10 minutes)

Prerequisites

  • Mac with Xcode 15+ installed
  • iPhone with ARKit support (iPhone 6s or newer, iOS 15+)
  • Python 3.11+ installed
  • USB cable to connect iPhone to Mac

5-Step Setup

# 1. Clone the repository
git clone https://github.com/Dparent97/Reality-layer.git
cd Reality-layer

# 2. Start backend (creates venv, installs deps, runs in mock mode)
./backend/run.sh

# 3. Get your Mac's IP address (note this for step 5)
ipconfig getifaddr en0
# 4. Open iOS project in Xcode
open RealityLayer.xcodeproj
# 5. In Xcode:
#    - Select your connected iPhone as the build target
#    - Update Config.swift: change localhost to your Mac's IP (from step 3)
#    - Press Cmd+R to build and run

First Run Experience

  1. Grant Permissions: Tap "Allow" for camera and AR access
  2. Point Camera: Aim at any object (breaker panel works great for demo)
  3. Tap "Analyze Scene": Wait 1-2 seconds for mock response
  4. See AR Labels: 3D text labels appear anchored in space
  5. Clear & Repeat: Tap the X button to clear and try again

Verify It's Working

Backend health check:

curl http://localhost:8000/health
# Expected: {"status":"healthy","version":"0.1.0","environment":"development"}

Test analysis endpoint:

curl -X POST http://localhost:8000/analyze \
  -F "file=@/path/to/any/image.jpg" \
  -F "mock=true"

What It Does

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   iPhone    │────▶│   Backend   │────▶│   Gemini    │
│  AR Camera  │◀────│   FastAPI   │◀────│  Pro Vision │
└─────────────┘     └─────────────┘     └─────────────┘
     │                    │                    │
     │  Camera frame      │  Image + prompt    │  JSON response
     │  as JPEG           │                    │  with objects
     ▼                    ▼                    ▼
┌─────────────────────────────────────────────────────┐
│  AR Overlays: 3D labels anchored to detected       │
│  objects with safety warnings and descriptions     │
└─────────────────────────────────────────────────────┘
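
To make the middle box concrete, here is an illustrative mock-mode handler in FastAPI. It mirrors the flow above but is only a sketch, not the project's actual main.py; field names follow the response example in the API Reference below:

# Illustrative sketch of the backend's role: accept a JPEG frame, return
# detected objects as JSON. Not the actual main.py.
import time
import uuid
from fastapi import FastAPI, File, Form, UploadFile

app = FastAPI()

@app.post("/analyze")
async def analyze(file: UploadFile = File(...), mock: bool = Form(False)):
    start = time.perf_counter()
    image_bytes = await file.read()
    if mock:
        objects = [{
            "id": "obj_1",
            "label": "Breaker Panel",
            "type": "label",
            "position_2d": [0.5, 0.5],
            "description": "Main distribution panel.",
        }]
    else:
        objects = []  # in the real service, image_bytes plus a prompt go to Gemini here
    return {
        "session_id": f"mock-{uuid.uuid4().hex[:8]}",
        "detected_objects": objects,
        "system_context": "Electrical Distribution System",
        "processing_time_ms": (time.perf_counter() - start) * 1000,
    }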

Use Cases:

  • Electrical panel identification and safety warnings
  • HVAC component labeling
  • Industrial equipment documentation
  • Educational overlays for physical systems

Architecture Overview

Component     Technology                              Purpose
iOS Client    Swift, SwiftUI, ARKit, RealityKit       Camera capture, AR rendering, user interface
Backend API   Python, FastAPI, Pydantic               Image processing, API gateway, mock mode
AI Service    Google Vertex AI (Gemini Pro Vision)    Multimodal image analysis and object detection

See docs/ARCHITECTURE.md for detailed system design.
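
As a rough sketch of the AI service hop, an image-plus-prompt call to Gemini through Vertex AI might look like the following. The module paths, region, and model name are assumptions that depend on the installed google-cloud-aiplatform version; the project's real integration lives in backend/gemini_service.py:

# Rough sketch of an image + prompt request to Gemini on Vertex AI.
# Assumes google-cloud-aiplatform is installed and GCP_PROJECT_ID is set;
# exact module paths and model names vary by SDK version.
import os
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project=os.environ["GCP_PROJECT_ID"], location="us-central1")
model = GenerativeModel("gemini-pro-vision")

def analyze_frame(jpeg_bytes: bytes) -> str:
    prompt = (
        "Identify the equipment in this image. Return JSON with a list of "
        "objects, each with a label, a 2D position, and any safety warnings."
    )
    response = model.generate_content([
        Part.from_data(data=jpeg_bytes, mime_type="image/jpeg"),
        prompt,
    ])
    return response.text  # JSON string to be parsed and validated by the backend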

Development Setup

Backend Development

cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run in development mode (mock enabled)
ENVIRONMENT=development DEBUG=true uvicorn main:app --reload

# Run tests
pytest -v

Environment Variables:

Variable         Default                   Description
GCP_PROJECT_ID   (required for real AI)    Google Cloud Project ID
ENVIRONMENT      development               development, staging, production
DEBUG            false                     Enable debug mode and API docs
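
As a sketch of how backend/config.py could map these variables to settings (assuming pydantic-settings; under Pydantic v1 the import path differs):

# Sketch of environment-driven settings, in the spirit of backend/config.py.
# Uses pydantic-settings (Pydantic v2); under Pydantic v1 the import is
# `from pydantic import BaseSettings` instead.
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    gcp_project_id: str | None = None    # required only when calling the real AI service
    environment: str = "development"     # development, staging, or production
    debug: bool = False                  # enables /docs and verbose errors

settings = Settings()  # values are read from GCP_PROJECT_ID, ENVIRONMENT, DEBUG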

iOS Development

  1. Open RealityLayer.xcodeproj in Xcode
  2. Select your development team (Signing & Capabilities)
  3. Update ios/Config.swift with backend URL for device testing
  4. Build to physical device (Simulator doesn't support ARKit)

See ios/README.md for detailed iOS setup.

API Reference

POST /analyze

Analyze an image frame and return detected objects with AR positioning.

Request:

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: multipart/form-data" \
  -F "file=@image.jpg" \
  -F "mock=true"  # Optional: use mock mode

Response:

{
  "session_id": "mock-abc12345",
  "detected_objects": [
    {
      "id": "obj_1",
      "label": "Breaker Panel",
      "type": "label",
      "position_2d": [0.5, 0.5],
      "description": "Main distribution panel. Ensure main breaker is OFF before servicing."
    },
    {
      "id": "obj_2",
      "label": "Main Breaker (200A)",
      "type": "warning",
      "position_2d": [0.5, 0.2],
      "description": "High Voltage! Do not touch terminals."
    }
  ],
  "system_context": "Electrical Distribution System",
  "processing_time_ms": 1023.45
}
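
For clients or tests, the response shape can be modeled explicitly. A minimal Pydantic sketch; the class names are illustrative, not necessarily those used inside the backend:

# Illustrative models mirroring the response above.
from pydantic import BaseModel

class DetectedObject(BaseModel):
    id: str
    label: str
    type: str                          # "label" or "warning"
    position_2d: tuple[float, float]   # x, y in the frame (0.0-1.0 in the mock data)
    description: str

class AnalyzeResponse(BaseModel):
    session_id: str
    detected_objects: list[DetectedObject]
    system_context: str
    processing_time_ms: float

# Example: validate a response fetched with requests
# payload = requests.post(...).json()
# result = AnalyzeResponse.model_validate(payload)  # Pydantic v2 (parse_obj on v1)
# warnings = [o for o in result.detected_objects if o.type == "warning"]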

GET /health

Health check endpoint.

curl http://localhost:8000/health

GET /docs

Interactive API documentation (available when DEBUG=true).

Demo Guide

See docs/DEMO_SCRIPT.md for:

  • What to demonstrate
  • Talking points
  • Expected behavior
  • Known limitations

Project Structure

Reality-layer/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── config.py            # Environment configuration
│   ├── gemini_service.py    # Gemini Vision integration
│   ├── run.sh               # One-command startup script
│   └── requirements.txt     # Python dependencies
├── ios/
│   ├── ContentView.swift     # Main UI
│   ├── ARViewContainer.swift # AR rendering
│   ├── NetworkManager.swift  # API client
│   ├── Config.swift          # iOS configuration
│   └── README.md            # iOS-specific docs
├── docs/
│   ├── ARCHITECTURE.md      # System design
│   └── DEMO_SCRIPT.md       # Demo guide
└── README.md                # This file

Troubleshooting

Backend won't start

# Check Python version
python --version  # Should be 3.11+

# Try manual setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload

iOS app can't connect to backend

  1. Ensure backend is running: curl http://localhost:8000/health
  2. Get Mac's IP: ipconfig getifaddr en0
  3. Update ios/Config.swift to use IP instead of localhost
  4. Ensure Mac and iPhone are on same Wi-Fi network
  5. Check Mac firewall allows incoming connections on port 8000
  6. If running uvicorn manually, start it with --host 0.0.0.0 so it listens on the LAN interface (by default it binds only to localhost)

AR features not working

  • ARKit requires a physical device (not Simulator)
  • Ensure camera permissions are granted
  • iPhone must have A12 chip or newer
  • Good lighting helps AR tracking

"Signing for Reality Layer requires a development team"

  1. Select project in Xcode
  2. Go to Signing & Capabilities tab
  3. Select your Apple Developer account under "Team"
  4. If no team, add Apple ID in Xcode → Settings → Accounts

Roadmap

  • Real-time streaming analysis
  • Offline caching of common patterns
  • Custom prompt domains (industrial, automotive, etc.)
  • Multi-user session sharing
  • Voice-guided instructions

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

License

MIT License - see LICENSE file for details.


Need help? Check the docs/ folder or open an issue on GitHub.
