An iOS application that uses ARKit and Core ML to recognize famous landmarks in real time and overlay contextual videos when a landmark is detected through the device camera.
This application combines computer vision, augmented reality, and machine learning to create an interactive landmark discovery experience. When users point their device camera at famous landmarks, the app automatically recognizes them and displays relevant video content as an overlay.
- Smart Detection System: Uses a custom Core ML model (`Landmark_Classifier`) for accurate landmark identification
- Consecutive Validation: Requires 3 consecutive detections to minimize false positives
- Dynamic Confidence Thresholds: Adaptive confidence levels (85% for new detections, 75% during video playback)
- Ambiguity Prevention: Validates confidence gaps between top predictions to avoid uncertain results
- Seamless Overlay: Videos play as full-screen overlays using `AVPlayerLayer`
- Smart State Management: Prevents unnecessary video restarts for the same landmark
- Auto-switching: Intelligently switches videos when different landmarks are detected
- Graceful Stopping: Stops the video only after 2 consecutive non-detections, so a single missed frame does not interrupt playback
- Concurrent Processing Protection: Prevents frame processing conflicts with threading guards
- Optimized Classification Timing: 0.8-second intervals for responsive detection
- Memory Management: Proper cleanup of video players, timers, and notification observers
- AR Session Management: Efficient ARKit configuration with horizontal plane detection
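The detection rules in the list above (3 consecutive hits, 85%/75% thresholds, a confidence gap between the top two predictions, and stopping after 2 misses) can be sketched as a small state machine. This is an illustrative sketch, not the app's actual code: `Prediction`, `OverlayAction`, and `DetectionGate` are hypothetical names, and the `minimumGap` value of 0.15 is an assumption because the required gap is not stated here. The sketch also assumes that a valid detection of a different landmark does not count as a miss for the one currently playing.

```swift
import Foundation

/// One classifier output: a label and its confidence in 0...1.
struct Prediction {
    let label: String
    let confidence: Double
}

/// What the caller should do with the video overlay after a frame.
enum OverlayAction: Equatable {
    case none
    case play(String)   // start, or switch to, this landmark's video
    case stop
}

struct DetectionGate {
    // Values taken from the feature list.
    let newDetectionThreshold = 0.85
    let playbackThreshold = 0.75
    let requiredHits = 3
    let allowedMisses = 2
    // Assumed value: the required gap is not documented.
    let minimumGap = 0.15

    let knownLandmarks: Set<String>
    private(set) var playing: String?
    private var candidate: String?
    private var hits = 0
    private var misses = 0

    init(knownLandmarks: Set<String>) {
        self.knownLandmarks = knownLandmarks
    }

    /// Feed one frame's predictions; returns the action for the overlay.
    mutating func process(_ predictions: [Prediction]) -> OverlayAction {
        let ranked = predictions.sorted { $0.confidence > $1.confidence }
        // The threshold is lower while a video is already playing.
        let threshold = playing == nil ? newDetectionThreshold : playbackThreshold
        guard let top = ranked.first,
              knownLandmarks.contains(top.label),
              top.confidence >= threshold,
              ranked.count < 2 || top.confidence - ranked[1].confidence >= minimumGap
        else { return registerMiss() }

        misses = 0
        if top.label == playing {
            // Same landmark: keep the current video, do not restart it.
            candidate = nil
            hits = 0
            return .none
        }
        if top.label == candidate {
            hits += 1
        } else {
            candidate = top.label
            hits = 1
        }
        guard hits >= requiredHits else { return .none }
        playing = top.label
        candidate = nil
        hits = 0
        return .play(top.label)
    }

    private mutating func registerMiss() -> OverlayAction {
        candidate = nil
        hits = 0
        guard playing != nil else { return .none }
        misses += 1
        guard misses >= allowedMisses else { return .none }
        playing = nil
        misses = 0
        return .stop
    }
}
```

Keeping this logic free of ARKit and AVKit types makes it easy to unit test without a device; the caller would translate `.play` and `.stop` into actual player calls.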
The application currently recognizes these famous landmarks:
- Taj Mahal 🇮🇳
- Colosseum 🇮🇹
- Eiffel Tower 🇫🇷
- Statue of Liberty 🇺🇸
- Golden Gate Bridge 🇺🇸
- Leaning Tower of Pisa 🇮🇹
- ARKit: Real-time camera feed and AR session management
- Core ML: Machine learning model inference for landmark classification
- Vision Framework: Image processing and ML request handling
- AVKit: Video playback and overlay management
- SceneKit: 3D scene rendering (future expansion capability)
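As an illustration of the AVKit part of this stack, the sketch below shows one way to manage a full-screen `AVPlayerLayer` overlay that avoids restarting the video for the same landmark and cleans up its player and notification observer. `VideoOverlayController` and its method names are hypothetical, and removing the overlay when the video ends is an assumption about the desired behavior.

```swift
import AVFoundation
import UIKit

final class VideoOverlayController {
    private let hostView: UIView
    private var player: AVPlayer?
    private var playerLayer: AVPlayerLayer?
    private var endObserver: NSObjectProtocol?
    private(set) var currentLandmark: String?

    init(hostView: UIView) {
        self.hostView = hostView
    }

    /// Shows the video for `landmark`. Call on the main thread.
    func show(landmark: String, videoURL: URL) {
        // Same landmark: keep playing instead of restarting.
        guard landmark != currentLandmark else { return }
        // Different landmark: tear down the old overlay first.
        stop()

        let player = AVPlayer(url: videoURL)
        let layer = AVPlayerLayer(player: player)
        layer.frame = hostView.bounds
        layer.videoGravity = .resizeAspectFill
        hostView.layer.addSublayer(layer)

        // Assumption: remove the overlay once the video finishes.
        endObserver = NotificationCenter.default.addObserver(
            forName: .AVPlayerItemDidPlayToEndTime,
            object: player.currentItem,
            queue: .main
        ) { [weak self] _ in
            self?.stop()
        }

        player.play()
        self.player = player
        self.playerLayer = layer
        currentLandmark = landmark
    }

    /// Stops playback and releases the player, layer, and observer.
    func stop() {
        if let endObserver {
            NotificationCenter.default.removeObserver(endObserver)
        }
        endObserver = nil
        player?.pause()
        playerLayer?.removeFromSuperlayer()
        player = nil
        playerLayer = nil
        currentLandmark = nil
    }

    deinit {
        stop()
    }
}
```

A view controller would typically create one controller with its root view as `hostView` and call `show(landmark:videoURL:)` or `stop()` as detections change.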
- Frame Capture: ARSCNView captures camera frames at 0.8s intervals
- ML Processing: Core ML model processes frames for landmark classification
- Validation: Multiple validation layers ensure accuracy:
  - Confidence threshold validation
  - Consecutive detection confirmation
  - Known landmark verification
  - Ambiguity resolution
- Video Trigger: Qualified detections trigger appropriate video overlays
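The frame capture and ML processing steps above might be wired together roughly as follows. This is a sketch under assumptions rather than the project's implementation: `LandmarkClassifier` and `onResults` are hypothetical names, the `.right` orientation assumes the device is held in portrait, and the caller is expected to invoke `classifyCurrentFrame(of:)` from a repeating 0.8-second `Timer` on the main thread.

```swift
import ARKit
import CoreML
import Vision

final class LandmarkClassifier {
    private let request: VNCoreMLRequest
    private let queue = DispatchQueue(label: "landmark.classification")
    // Threading guard: read and written only on the main thread.
    private var isProcessing = false

    /// Called on the main thread with the top predictions, best first.
    var onResults: (([(label: String, confidence: Float)]) -> Void)?

    init(model: MLModel) throws {
        let visionModel = try VNCoreMLModel(for: model)
        request = VNCoreMLRequest(model: visionModel)
        request.imageCropAndScaleOption = .centerCrop
    }

    /// Classifies the most recent camera frame, skipping the call
    /// if the previous frame is still being processed.
    func classifyCurrentFrame(of sceneView: ARSCNView) {
        guard !isProcessing,
              let frame = sceneView.session.currentFrame else { return }
        isProcessing = true
        let pixelBuffer = frame.capturedImage

        queue.async { [weak self] in
            guard let self else { return }
            // Assumption: portrait orientation.
            let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                                orientation: .right,
                                                options: [:])
            do {
                try handler.perform([self.request])
            } catch {
                print("Classification failed: \(error)")
            }
            let observations =
                (self.request.results as? [VNClassificationObservation]) ?? []
            let results = observations.prefix(3).map {
                (label: $0.identifier, confidence: $0.confidence)
            }
            DispatchQueue.main.async {
                self.isProcessing = false
                self.onResults?(results)
            }
        }
    }
}
```

Assuming Xcode generates a `Landmark_Classifier` class from the model file, an instance could be created with `try LandmarkClassifier(model: Landmark_Classifier(configuration: MLModelConfiguration()).model)`. The validation and video trigger steps would then run inside `onResults`.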
- iOS 17.6+ (project minimum; ARKit itself is available on earlier iOS versions)
- Device with A14 processor or newer (project minimum)
- Camera access permissions
- Internet connection (for streaming video content)
- Clone the repository
- Open in Xcode
- Add the ML Model
  - Place your `Landmark_Classifier.mlmodel` file in the project
  - Ensure it's added to the target
- Configure Permissions
  - Camera usage permission is required in `Info.plist`
- Build and Run
  - Select a physical iOS device (ARKit requires a physical device)
  - Build and run the project
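If the app fails to start its AR session after these steps, a preflight check along the following lines can help narrow down the cause. It is a hedged sketch: `checkPrerequisites` is a hypothetical helper, not part of the project. `NSCameraUsageDescription` is the standard `Info.plist` key for the camera usage description.

```swift
import ARKit
import AVFoundation

/// Reports whether AR with camera access can start on this device.
/// The completion handler is called on the main thread when invoked from it.
func checkPrerequisites(completion: @escaping (Bool) -> Void) {
    // Returns false on the simulator and on unsupported hardware.
    guard ARWorldTrackingConfiguration.isSupported else {
        print("ARKit world tracking is not supported on this device.")
        completion(false)
        return
    }

    // The app is terminated on camera access if this key is missing.
    guard Bundle.main.object(forInfoDictionaryKey: "NSCameraUsageDescription") != nil else {
        print("NSCameraUsageDescription is missing from Info.plist.")
        completion(false)
        return
    }

    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        completion(true)
    case .notDetermined:
        // Prompts the user; the callback may arrive on a background queue.
        AVCaptureDevice.requestAccess(for: .video) { granted in
            DispatchQueue.main.async { completion(granted) }
        }
    default:
        // Denied or restricted: the user must change this in Settings.
        completion(false)
    }
}
```

Calling this from `viewDidLoad` and running the AR session only when it reports `true` gives a clearer failure message than a black camera view.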
- Launch the App: Open the application on your iOS device
- Point Camera: Aim your device camera at a supported landmark
- Wait for Recognition: The app will analyze the scene automatically
- Enjoy Video: Once detected, a contextual video will overlay your view
- Explore More: Move to different landmarks for new content
- Expanded Landmark Database: Add more landmarks and cultural sites
- 3D AR Models: Integrate 3D models alongside video content
- Audio Narration: Add voice-over descriptions for landmarks
- Offline Mode: Support for offline landmark recognition
- User-Generated Content: Allow users to add custom landmarks and videos
- Social Features: Share discoveries with friends
- Travel Integration: Connect with travel planning apps
We welcome contributions! Areas where you can help:
- Adding new landmark recognition models
- Improving detection accuracy
- Adding new video content
- UI/UX enhancements
- Performance optimizations
- Documentation improvements
This project is licensed under the MIT License - see the LICENSE file for details.
- Video content sourced from Pixabay
- ARKit and Core ML frameworks by Apple
- Open source community for inspiration and resources
Note: This application is designed for educational and demonstration purposes. Ensure you have proper permissions for any video content used in production applications.