A visionOS app that transforms 2D photos into immersive 3D spatial experiences using AI-powered depth estimation.
- AI Depth Estimation: Uses Apple's Depth Anything V2 CoreML model to infer depth from any 2D photo
- 3D Displacement Mesh: Converts depth maps into real 3D geometry with smooth normal calculation
- Immersive Display: View your spatial photos in a mixed reality immersive space on Vision Pro
- File Picker: Load photos from any folder including Downloads
- Sample Photo: Built-in test image to try the feature instantly
- Simulator Support: CPU-only inference mode for visionOS Simulator development
- Xcode 17+
- visionOS 26+ SDK
- Apple Vision Pro or visionOS Simulator
- Open `spatial-photo.xcodeproj` in Xcode
- Select the visionOS Simulator (Apple Vision Pro)
- Build and run (⌘R)
- Tap "Try Sample Photo" or "Select Photo" to load an image
- Wait for depth processing (~1 second on device, longer on simulator)
- Tap "View in Space" to see your 3D spatial photo
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Select    │────▶│    Load     │────▶│    Depth    │────▶│  Generate   │
│    Photo    │     │    Image    │     │  Inference  │     │    Mesh     │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                    │
                                                                    ▼
                                                            ┌─────────────┐
                                                            │ Display in  │
                                                            │  Immersive  │
                                                            │    Space    │
                                                            └─────────────┘
```
- Image Loading: The `ImageLoader` service loads images with security-scoped resource access for file picker selections (see the sketch after this list)
- Depth Inference: The `DepthProcessor` runs the Depth Anything V2 Small model via CoreML's Vision framework
- Mesh Generation: The `SpatialPhotoMeshGenerator` creates a displacement mesh with bilinear depth sampling
- Immersive Display: RealityKit renders the textured mesh in a mixed reality immersive space
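The security-scoped access step matters because URLs returned by the file picker point outside the app sandbox (for example, Downloads). The sketch below shows the general pattern, assuming the loader produces a `CGImage`; the `loadPickedImage(from:)` function name is illustrative, not the project's actual API.

```swift
import CoreGraphics
import Foundation
import ImageIO

// Illustrative sketch only: load a CGImage from a file-picker URL.
// Security-scoped access is required for files outside the app sandbox
// (e.g. Downloads); the function name is hypothetical.
func loadPickedImage(from url: URL) throws -> CGImage {
    let needsScope = url.startAccessingSecurityScopedResource()
    defer {
        if needsScope { url.stopAccessingSecurityScopedResource() }
    }

    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
          let image = CGImageSourceCreateImageAtIndex(source, 0, nil) else {
        throw CocoaError(.fileReadCorruptFile)
    }
    return image
}
```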
This app uses Depth Anything V2 Small from Apple's CoreML model collection:
- Architecture: DPT-based with DINOv2 encoder
- Input Size: 518 × 392 pixels
- Model Size: ~50MB (Float16)
- Inference: Runs on Neural Engine for optimal performance (CPU on simulator)
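As a rough sketch of how a model like this can be wired into Vision, the snippet below loads the CoreML package with a compute-unit configuration and wraps it in a `VNCoreMLModel`. The `DepthAnythingV2SmallF16` class name is an assumption about the generated model class, not a confirmed identifier from the project.

```swift
import CoreML
import Vision

// Sketch only: load the depth model with simulator-aware compute units
// and wrap it for use with VNCoreMLRequest.
// "DepthAnythingV2SmallF16" is a hypothetical generated class name.
func makeDepthModel() throws -> VNCoreMLModel {
    let config = MLModelConfiguration()
    #if targetEnvironment(simulator)
    config.computeUnits = .cpuOnly   // Neural Engine is unavailable in the simulator
    #else
    config.computeUnits = .all       // Prefer Neural Engine / GPU on device
    #endif

    let mlModel = try DepthAnythingV2SmallF16(configuration: config).model
    return try VNCoreMLModel(for: mlModel)
}
```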
The depth processor handles multiple Vision framework observation types:
- VNPixelBufferObservation: Common for depth models outputting image-to-image results
- VNCoreMLFeatureValueObservation: For models outputting MLMultiArray data
Supported pixel buffer formats:
- `OneComponent32Float` - 32-bit floating point depth
- `OneComponent16Half` - 16-bit half precision
- `DepthFloat32` - Dedicated depth format
- `OneComponent8` - 8-bit grayscale
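A minimal sketch of what handling the pixel-buffer case can look like, assuming a `OneComponent32Float` buffer; the other formats follow the same pattern with different element types, and the function name is illustrative.

```swift
import CoreVideo
import Vision

// Sketch: pull raw depth values out of a Vision observation.
// Only the OneComponent32Float path is shown; half-float and 8-bit
// formats would need their own element types.
func depthValues(from observation: VNObservation) -> [Float]? {
    guard let pixelObservation = observation as? VNPixelBufferObservation else {
        return nil   // e.g. VNCoreMLFeatureValueObservation → read the MLMultiArray instead
    }
    let buffer = pixelObservation.pixelBuffer
    guard CVPixelBufferGetPixelFormatType(buffer) == kCVPixelFormatType_OneComponent32Float else {
        return nil
    }

    CVPixelBufferLockBaseAddress(buffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    let width = CVPixelBufferGetWidth(buffer)
    let height = CVPixelBufferGetHeight(buffer)
    let rowBytes = CVPixelBufferGetBytesPerRow(buffer)
    guard let base = CVPixelBufferGetBaseAddress(buffer) else { return nil }

    var values = [Float]()
    values.reserveCapacity(width * height)
    for y in 0..<height {
        let row = (base + y * rowBytes).assumingMemoryBound(to: Float.self)
        for x in 0..<width { values.append(row[x]) }
    }
    return values
}
```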
```
SpatialPhotoViewModel (@Observable, @MainActor)
│
├── ImageLoader (actor) ──────────► CGImage
│
├── DepthProcessor (actor) ───────► Depth map via CoreML
│
└── SpatialPhotoMeshGenerator ────► MeshResource with displacement
```
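In code, that layering corresponds roughly to the skeleton below. The method names and placeholder bodies are illustrative only, not the project's API; the point is that the service actors keep loading and inference off the main thread while the `@Observable` view model publishes state from the MainActor.

```swift
import CoreGraphics
import Foundation
import Observation

// Skeleton of the layering above, with placeholder bodies.
// Method names here are assumptions for illustration, not the project's API.
actor ImageLoaderSketch {
    func load(_ url: URL) async throws -> CGImage { fatalError("placeholder") }
}

actor DepthProcessorSketch {
    func estimateDepth(of image: CGImage) async throws -> [Float] { fatalError("placeholder") }
}

@MainActor
@Observable
final class SpatialPhotoViewModelSketch {
    private let imageLoader = ImageLoaderSketch()
    private let depthProcessor = DepthProcessorSketch()

    private(set) var depthMap: [Float] = []

    func process(url: URL) async throws {
        // Work hops to the service actors; published state stays on the MainActor.
        let image = try await imageLoader.load(url)
        depthMap = try await depthProcessor.estimateDepth(of: image)
    }
}
```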
| Component | Description |
|---|---|
| `SpatialPhotoViewModel` | Observable state management for the processing pipeline |
| `ImageLoader` | Thread-safe image loading with security-scoped resource access |
| `DepthProcessor` | CoreML depth inference with Vision framework integration |
| `SpatialPhotoMeshGenerator` | Creates displacement meshes from depth data |
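To make the mesh generator's role concrete, here is a simplified sketch of building a displaced grid from a row-major depth map with RealityKit's `MeshDescriptor`. The grid layout, depth scale, and function name are illustrative; normals are omitted, whereas the real generator also does bilinear depth sampling and smooth normal calculation.

```swift
import RealityKit
import simd

// Simplified sketch: build a displaced grid mesh from a row-major depth map.
// Grid layout, depth scale, and function name are illustrative.
func makeDisplacementMesh(depth: [Float], width: Int, height: Int,
                          depthScale: Float = 0.3) throws -> MeshResource {
    var positions: [SIMD3<Float>] = []
    var uvs: [SIMD2<Float>] = []
    var indices: [UInt32] = []

    // One vertex per depth sample, displaced along +Z by the estimated depth.
    for y in 0..<height {
        for x in 0..<width {
            let u = Float(x) / Float(width - 1)
            let v = Float(y) / Float(height - 1)
            let z = depth[y * width + x] * depthScale
            positions.append(SIMD3(u - 0.5, 0.5 - v, z))
            uvs.append(SIMD2(u, 1 - v))
        }
    }

    // Two counter-clockwise triangles per grid cell.
    for y in 0..<(height - 1) {
        for x in 0..<(width - 1) {
            let i = UInt32(y * width + x)
            let w = UInt32(width)
            indices += [i, i + w, i + 1,
                        i + 1, i + w, i + w + 1]
        }
    }

    var descriptor = MeshDescriptor(name: "spatialPhoto")
    descriptor.positions = MeshBuffers.Positions(positions)
    descriptor.textureCoordinates = MeshBuffers.TextureCoordinates(uvs)
    descriptor.primitives = .triangles(indices)
    return try MeshResource.generate(from: [descriptor])
}
```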
```
spatial-photo/
├── spatial-photo/
│   ├── Models/              # Data models + CoreML model
│   ├── ViewModels/          # Observable state management
│   ├── Views/               # SwiftUI views
│   ├── Services/            # ImageLoader, DepthProcessor
│   ├── Rendering/           # Mesh generation
│   └── Extensions/          # CGImage, MLMultiArray helpers
├── spatial-photoTests/      # Unit tests
└── Packages/
    └── RealityKitContent/   # RealityKit assets
```
Run the test suite:
```bash
xcodebuild test -project spatial-photo.xcodeproj \
  -scheme spatial-photo \
  -destination 'platform=visionOS Simulator,name=Apple Vision Pro'
```

| Test Suite | Tests | Description |
|---|---|---|
| `SpatialPhotoDataTests` | 2 | Processing state validation |
| `MLMultiArrayExtensionTests` | 4 | Depth data conversion from MLMultiArray |
| `SpatialPhotoMeshGeneratorTests` | 2 | Mesh generator initialization |
| `CGImageExtensionTests` | 2 | Image to CVPixelBuffer conversion |
| `ImageLoaderTests` | 3 | Image loading and format support |
| `DepthProcessorTests` | 3 | Model loading and depth inference |
| `SpatialPhotoViewModelTests` | 3 | ViewModel state management |
Note: Tests that require RealityKit mesh generation have been removed as they are not supported in the visionOS Simulator.
| Feature | Simulator | Device |
|---|---|---|
| Compute Units | CPU only | Neural Engine + GPU |
| Inference Speed | Slower | ~1 second |
| Mesh Rendering | Limited | Full support |
| Immersive Space | Basic | Full mixed reality |
The app automatically detects the environment and configures CoreML appropriately:
```swift
#if targetEnvironment(simulator)
config.computeUnits = .cpuOnly
#else
config.computeUnits = .all
#endif
```

- Depth Anything V2 - CoreML model by Apple
- Depth Anything - Original research by Lihe Yang et al.
MIT License
- v1.0 - Initial release with depth estimation and immersive display