# Depth-Uncertainty Model
This repository holds the code for the depth-uncertainty model adapted from Sam Tukra and Stamatia Giannarou's *Randomly-connected neural networks for self-supervised monocular depth estimation*. This model uses predictive techniques to learn the aleatoric uncertainty:
Here, the left-most image is the original, the second from the left is the predicted disparity, the second from the right is the uncertainty, and the right-most image is the true error.
To use this package, you will need Python 3.6 or higher. Using an NVIDIA GPU, such as an RTX 6000, is recommended.
Download the repository from GitHub, then create and activate a virtual environment:

```shell
python -m venv venv
. venv/bin/activate
```

Install all the packages from pip:

```shell
python -m pip install -r requirements.txt
```

To use this code, you can either:
- Run the example scripts.
- Run one of the Python `main` programs.
- Use the packages directly.
There are three datasets currently available with this package:
- CityScapes (`leftImg8bit` and `rightImg8bit` datasets).
- Hamlyn da Vinci Dataset.
- SCARED Dataset.
Note: the example scripts and main programs locate the datasets by assuming there is a `datasets` folder in your home directory, containing datasets named `cityscapes`, `da-vinci`, and `scared` respectively. To alter this, you can set the path of the directory in which it searches for datasets. In the future, this will be replaced with a flag for the complete dataset path.
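For example, the default layout the scripts expect can be created like this (folder names as described above):

```shell
# Create the default dataset layout in the home directory.
mkdir -p ~/datasets/cityscapes ~/datasets/da-vinci ~/datasets/scared
ls ~/datasets
```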
Within the model package, there is a separate module for the:
- Encoder architecture (`encoder.py`).
- Decoder architecture (`decoder.py`).
- Complete model, i.e. the encoder and decoder (`model.py`).
- Discriminator, used with the `--adversarial` flag in the main program (`discriminator.py`).
There is also a package for all the PyTorch layers used in the model, and a module for creating and loading the graphs used in the randomly-connected encoder architecture.
Within the train package, there are modules for:
- Training and checkpointing the model (`train.py`).
- Evaluating the model and saving example images (`evaluate.py`).
- The loss functions needed to train the model (`loss.py`).
- Transforms, based on `torchvision.transforms`, that are compatible with the datasets (`transforms.py`).
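The reason dataset-compatible transforms are needed is that stock `torchvision` random transforms draw their random parameters independently per image, whereas a stereo pair must share them. A minimal sketch of the idea (the function below is illustrative, not this package's actual transform; images are lists of rows for simplicity):

```python
import random

def paired_hflip(left, right, p=0.5, rng=random):
    """Randomly flip both images of a stereo pair together.

    Illustrative sketch: a stereo pair must share the same random flip
    decision, and flipping horizontally also swaps which view is the
    "left" and which is the "right" image.
    """
    if rng.random() < p:
        flip = lambda img: [row[::-1] for row in img]
        return flip(right), flip(left)
    return left, right
```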
The model, discriminator, and loss-function classes all take keyword arguments, so it is possible to unpack a dictionary of key-value pairs directly into each class constructor. See the `config.yml` file for how to create the model architecture from kwargs.
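As a sketch of this kwargs pattern (the class name and parameters below are illustrative, not the package's actual API), a dictionary mirroring a section of `config.yml` unpacks straight into a constructor:

```python
# Illustrative only: the real constructors live in the model package.
class Model:
    def __init__(self, *, layers=4, channels=32, attention=False):
        self.layers = layers
        self.channels = channels
        self.attention = attention

# A dict of config values (e.g. parsed from config.yml) unpacks with **.
model_config = {"layers": 5, "channels": 64, "attention": True}
model = Model(**model_config)
```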
To change the loss function used for learning uncertainty, set `error_loss_config.loss_type` to either `l1` for the absolute-error loss, or `bayesian` for the Laplacian log-likelihood.
Note: the names `l1` and `bayesian` should be updated to reflect what their loss functions actually calculate.
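For reference, the Laplacian negative log-likelihood commonly used for aleatoric uncertainty has the form `|y - mu| / b + log(2b)` for predicted scale `b`; a sketch of both loss types, assuming the network predicts the log of the scale for numerical stability (this repository's exact formulation may differ):

```python
import math

def l1_loss(pred, target):
    # Plain absolute-error loss (the "l1" option).
    return abs(pred - target)

def laplacian_nll(pred, target, log_b):
    # Negative log-likelihood of a Laplace distribution (the "bayesian"
    # option): -log p(y | mu, b) = |y - mu| / b + log(2b).
    # Predicting log b keeps the scale strictly positive.
    b = math.exp(log_b)
    return abs(pred - target) / b + math.log(2.0 * b)
```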
```
graphs/
    nodes_5_seed_42/        # default graphs used in the encoder
loaders/
    __init__.py
    cityscapes.py           # CityScapes 8-bit Dataset
    davinci.py              # Hamlyn da Vinci Dataset
    scared.py               # SCARED Dataset
model/
    __init__.py
    layers/                 # layers, blocks and stages used in the model
        attention.py
        decoder.py
        encoder.py
    decoder.py
    discriminator.py        # for adversarial training
    encoder.py
    graph.py                # for building the graph structure within the encoder
    model.py
scripts/
    demo_train.sh           # showcase the model training
    finetune_test.sh        # example for finetuning the model
    local_test.sh           # test training without CUDA
    macos_setup.sh          # set up GPU training with MPS
    package_model.sh        # zip the model package
    parallel_test.sh        # test training the model with DDP
    rcs_parallel_job.rcs    # train the model using DDP on HPC
    rcs_serial_job.pbs      # train the model without DDP on HPC
    remove_model.sh         # (Paperspace only) delete the package
    slurm_job.sh            # train the model on SLURM
train/
    __init__.py
    evaluate.py
    loss.py                 # all loss functions used in training
    train.py
    transforms.py           # modified torchvision transforms
utils.py                    # useful functions and type definitions
.gitignore
config.yml                  # config file for building the model, discriminator and losses
main.py                     # for training the model without DDP
parallel_main.py            # for training the model with DDP
readme.md
requirements.txt
results.png
```