This project performs depth estimation and 3D point cloud generation from images using deep learning. It leverages:
- Streamlit for interactive UI
- OpenCV for image processing
- Hugging Face Transformers (DPT Model) for depth estimation
- Plotly for 3D point cloud visualization
Key features:
- Upload an image or capture one from the webcam
- Generate a depth map using Intel's DPT-Large model
- Convert the depth map into a 3D point cloud
- Explore the result interactively with Plotly
Project structure:

```
.
├── app.py          # Streamlit-based UI for depth estimation & 3D visualization
├── capture.py      # Webcam image capture script
├── depth_model.py  # Depth estimation using the DPT-Large model
├── point_cloud.py  # Converts the depth map to 3D points & visualizes it
├── images/         # Stores uploaded or captured images
├── output/         # Stores generated depth maps
└── README.md       # Documentation
```
Ensure you have Python installed (preferably 3.8+). Install required libraries:
```
pip install streamlit opencv-python numpy torch transformers plotly
```

For GPU acceleration, install torch with CUDA support:

```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```
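To verify that the CUDA-enabled build is active:

```python
import torch

# Prints True when torch can see a CUDA GPU, False on the CPU-only build
print(torch.cuda.is_available())
print(torch.__version__)
```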
`app.py` is the Streamlit entry point. It:
- Allows users to upload an image
- Calls `estimate_depth()` to generate a depth map
- Converts the depth map into a 3D point cloud
- Displays an interactive 3D visualization

Internally it handles:
- File handling: saves uploaded images to `images/`
- Depth estimation: calls `estimate_depth()` from `depth_model.py`
- 3D point cloud conversion: calls `depth_to_3d()` and `visualize_3d_point_cloud()` from `point_cloud.py`
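A minimal sketch of how `app.py` could wire these steps together; the Streamlit layout and file-saving details below are assumptions, only `estimate_depth`, `depth_to_3d`, and `visualize_3d_point_cloud` come from the project:

```python
import os
import streamlit as st

from depth_model import estimate_depth
from point_cloud import depth_to_3d, visualize_3d_point_cloud

st.title("Depth Estimation & 3D Point Cloud")

# Save the uploaded image into images/ so estimate_depth() can read it from disk
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
if uploaded is not None:
    os.makedirs("images", exist_ok=True)
    image_path = os.path.join("images", uploaded.name)
    with open(image_path, "wb") as f:
        f.write(uploaded.getbuffer())

    st.image(image_path, caption="Original image")

    # Depth estimation -> 3D point cloud -> interactive Plotly figure
    depth_map = estimate_depth(image_path)
    x, y, z = depth_to_3d(depth_map)
    fig = visualize_3d_point_cloud(x, y, z)
    st.plotly_chart(fig, use_container_width=True)
```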
`capture.py`:
- Uses OpenCV to capture an image from the webcam
- Saves the image in `images/`
- Press `SPACE` to capture an image
- Press `ESC` to exit
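A minimal sketch of such a capture loop; the camera index and output filename are assumptions:

```python
import os
import cv2

os.makedirs("images", exist_ok=True)
cap = cv2.VideoCapture(0)  # change the index if the default camera is not found

while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow("Capture - SPACE to save, ESC to exit", frame)

    key = cv2.waitKey(1) & 0xFF
    if key == 27:        # ESC: quit without saving
        break
    if key == 32:        # SPACE: save the current frame and exit
        cv2.imwrite("images/captured.jpg", frame)
        break

cap.release()
cv2.destroyAllWindows()
```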
`depth_model.py`:
- Uses Intel's DPT-Large model for depth estimation
- Converts an image into a single-channel (grayscale) depth map
- Saves the depth map in `output/`

`estimate_depth(image_path)`:
- Loads the image with OpenCV
- Converts it to RGB and runs it through the DPT-Large model
- Normalizes the depth values
- Saves the result as a colormapped image (`output/depth_map.jpg`)
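A minimal sketch of `estimate_depth()` along these lines, using the `Intel/dpt-large` checkpoint from Hugging Face; the interpolation, normalization, and colormap choices here are assumptions:

```python
import os

import cv2
import numpy as np
import torch
from transformers import DPTImageProcessor, DPTForDepthEstimation

processor = DPTImageProcessor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")

def estimate_depth(image_path):
    # Load with OpenCV (BGR) and convert to RGB for the model
    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        predicted_depth = model(**inputs).predicted_depth  # shape (1, H', W')

    # Resize the prediction back to the input resolution
    depth = torch.nn.functional.interpolate(
        predicted_depth.unsqueeze(1),
        size=image.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze().cpu().numpy()

    # Normalize to 0-255 and save a colormapped visualization
    depth_norm = cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    os.makedirs("output", exist_ok=True)
    cv2.imwrite("output/depth_map.jpg", cv2.applyColorMap(depth_norm, cv2.COLORMAP_INFERNO))
    return depth_norm
```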
`point_cloud.py`:
- Converts the depth map to 3D coordinates
- Uses Plotly for interactive 3D visualization

`depth_to_3d(depth_map, focal_length=500)`:
- Converts pixel depth values to real-world 3D points
- Uses the focal length to back-project pixels into 3D space

`visualize_3d_point_cloud(x, y, z)`:
- Uses Plotly to render an interactive 3D scatter plot
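A minimal sketch of both helpers, assuming a simple pinhole-camera back-projection with the default focal length of 500 pixels; the point subsampling in the plot is an added assumption to keep the figure responsive:

```python
import numpy as np
import plotly.graph_objects as go

def depth_to_3d(depth_map, focal_length=500):
    # Pinhole back-projection: X = (u - cx) * Z / f, Y = (v - cy) * Z / f
    h, w = depth_map.shape
    cx, cy = w / 2.0, h / 2.0
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    z = depth_map.astype(np.float32)
    x = (u - cx) * z / focal_length
    y = (v - cy) * z / focal_length
    return x.ravel(), y.ravel(), z.ravel()

def visualize_3d_point_cloud(x, y, z, step=4):
    # Subsample the points so the interactive plot stays responsive
    fig = go.Figure(
        data=go.Scatter3d(
            x=x[::step], y=y[::step], z=z[::step],
            mode="markers",
            marker=dict(size=1, color=z[::step], colorscale="Viridis"),
        )
    )
    fig.update_layout(scene=dict(aspectmode="data"))
    return fig
```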
Run the Streamlit app:

```
streamlit run app.py
```

Capture an image from the webcam:

```
python capture.py
```

Or use the modules directly from Python:

```python
from depth_model import estimate_depth
from point_cloud import depth_to_3d, visualize_3d_point_cloud

# Estimate depth
image_path = "images/sample.jpg"
depth_map = estimate_depth(image_path)

# Convert to 3D and visualize
x, y, z = depth_to_3d(depth_map)
fig = visualize_3d_point_cloud(x, y, z)
fig.show()
```

The application displays the original image and the computed depth map. The 3D view provides:
- A colored scatter plot representing depth in 3D space
- Rotate, zoom, and pan controls via Plotly's interactive interface
Troubleshooting:

| Issue | Solution |
|---|---|
| `cv2.VideoCapture(2)` not working | Change the camera index to 0 or 1 |
| Depth map not generated | Ensure `output/` exists and the model is correctly installed |
| No 3D visualization | Check whether `depth_map.jpg` is being created |
Planned improvements:
- Support for real-time video depth estimation
- Integration with LiDAR / IMU data for improved accuracy
- Faster depth processing with lighter models
Acknowledgments:
- Intel DPT-Large model from Hugging Face
- OpenCV for image processing
- Streamlit for the interactive UI
- Plotly for 3D visualization