- Course: GIST Computer Vision (EC4216)
- Project Type: Individual Coding Assignment (Supervised Depth Refinement Implementation)
In this project, we implemented Supervised Depth Refinement (SDR), which predicts a complete depth map from the given sparse depth, RGB image, and surface normals. The SDR model takes the sparse depth and RGB image as input and outputs depth and normals. Its weights are learned under the supervision of the sparse depth ground truth and the normal ground truth.
The baseline model consists of HoleFiller, UNet, and Depth2Normal modules, among which only UNet is trainable. It is trained to minimize Sparse depth loss and Normal loss. To improve the performance of the baseline model, we additionally designed and applied two boosting strategies: ArchBoost (Architecture Boost) and DataBoost (Data-driven Boost).
ArchBoost improves performance through three structural changes: (i) Smooth Hole-filling, (ii) Average Pooling Depth2Normal, and (iii) Auxiliary Depth Loss. DataBoost improves performance through two data-driven approaches: (i) Transfer Learning for robustness, and (ii) Sample Data Augmentation.
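The baseline objective described above (a sparse depth loss plus a normal loss computed through a depth-to-normal step) can be sketched roughly as follows. This is only an illustration, not the code in this repository: the function names, the L1 and cosine loss forms, the finite-difference normal estimate, the `sparse_gt > 0` validity mask, and the `normal_weight` parameter are all assumptions, and NumPy is used instead of PyTorch just to keep the sketch light.

```python
import numpy as np

def depth_to_normal(depth):
    """Estimate unit normals (H, W, 3) from a depth map (H, W).

    Assumption: normals are taken as (-dz/dx, -dz/dy, 1), normalized,
    using finite differences. The project's Depth2Normal may differ.
    """
    dz_dy, dz_dx = np.gradient(depth)  # gradients along rows (y), then columns (x)
    normal = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    return normal / np.linalg.norm(normal, axis=-1, keepdims=True)

def sparse_depth_loss(pred_depth, sparse_gt):
    """Mean absolute error over pixels with valid sparse ground truth.

    Assumption: a value > 0 in sparse_gt marks a valid measurement.
    """
    mask = sparse_gt > 0
    if not mask.any():
        return 0.0
    return float(np.abs(pred_depth[mask] - sparse_gt[mask]).mean())

def normal_loss(pred_normal, gt_normal):
    """Mean (1 - cosine similarity) between unit normal maps."""
    cos = np.sum(pred_normal * gt_normal, axis=-1)
    return float((1.0 - cos).mean())

def total_loss(pred_depth, sparse_gt, gt_normal, normal_weight=1.0):
    """Hypothetical combined objective: sparse depth term + weighted normal term."""
    pred_normal = depth_to_normal(pred_depth)
    return sparse_depth_loss(pred_depth, sparse_gt) + normal_weight * normal_loss(pred_normal, gt_normal)

# Toy example: a flat plane at depth 2.0 with two sparse measurements.
pred = np.full((4, 4), 2.0)
sparse = np.zeros((4, 4))
sparse[0, 0] = 2.0
sparse[3, 2] = 2.0
gt_normal = np.zeros((4, 4, 3))
gt_normal[..., 2] = 1.0  # plane facing the camera
print(total_loss(pred, sparse, gt_normal))
```

Only the pixels carrying a sparse measurement contribute to the depth term, so the normal term is what constrains the prediction everywhere else.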
Note: the installed torch build must match your CUDA version.
python -m venv venv
source venv/bin/activate
pip install numpy matplotlib tqdm # optionally also ipykernel
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128 # example for CUDA 12.8; change the index URL to match your CUDA version
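To confirm that the installed torch build matches your CUDA setup, a quick check along these lines may help. The helper name `describe_torch_install` is hypothetical and not part of this repository; `torch.version.cuda` is `None` on CPU-only builds.

```python
import importlib.util

def describe_torch_install():
    """Return a one-line summary of the torch build, or a notice if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch
    return (
        f"torch {torch.__version__}, "
        f"built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )

print(describe_torch_install())
```

If the CUDA version in the summary does not match your system, or the GPU is reported as unavailable, reinstall torch with the matching index URL.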
Run data augmentation:
python augmentation.py
Train and evaluate:
python main.py
