The code for Unsupervised Learning of Multi-Frame Optical Flow with Occlusions.
We propose a framework for unsupervised learning of optical flow and occlusions over multiple frames. Our multi-frame, occlusion-sensitive formulation outperforms existing unsupervised two-frame methods and even produces results on par with some fully supervised methods.
More details can be found on our Project Page.
The pytorch reimplentation can be found here
Overview:
You need to have Torch.
The code was tested with Torch7, CUDA 9.0, cudnn 7.0. When using CUDA 9.0 you will run into problems following the Torch installation guide. Execute the following command before calling install.sh to resolve the problem:
export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
For cudnn 7.0, you will also need to clone and install the Revision 7 branch of the cudnn.torch repository:
git clone https://github.com/soumith/cudnn.torch -b R7
cd cudnn.torch
luarocks make cudnn-scm-1.rockspec
Install other required packages:
cd extras/spybhwd
luarocks make
cd ../stnbhwd
luarocks make
back2future = require('back2future')
computeFlow = back2future.init('Ours-Soft-ft-KITTI')
We provide three pre-trained models from our paper. Download and store the models into the models folder:
im1 = image.load('samples/frame_0009.png' )
im2 = image.load('samples/frame_0010.png' )
im3 = image.load('samples/frame_0011.png' )
flow, fwd_occ, bwd_occ = computeFlow(im1, im2, im3)
flowX = require('flowExtensions')
flowX.writeFLO('samples/flow.flo', flow:float())
floImg = flowX.xy2rgb(flow[{1,{},{}}], flow[{2,{},{}}])
image.save('samples/flow.png', floImg)
image.save('samples/fwd_occ.png', fwd_occ * 255)
image.save('samples/bwd_occ.png', bwd_occ * 255)
More details in flowExtensions.
We provide the following files to read in images and gt flow from RoamingImages, KITTI and Sintel:
NOTE: Replace [PATH] in each file with the root path of the corresponding dataset.
- datasets/RoamingImages.dat: Our RoamingImages dataset
- datasets/Kitti2015.dat: KITTI 2015 Multiview Training set (excluding frames 9-12)
- datasets/Sintel.dat: Sintel Clean + Final Training
Pre-training using the hard constraint network on RoamingImages with linear motion:
th main.lua -cache checkpoints -expName Hard_Constraint -dataset RoamingImages -ground_truth \
-pme 1 -pme_criterion OBCC -smooth_flow 2
Fine-tuning 'Hard_Constraint' model after 10 iterations using the soft constraint network on KITTI:
th main.lua -cache checkpoints -expName Soft_KITTI -dataset Kitti2015 -LR 0.00001 \
-pme 2 -pme_criterion OBGCC -pme_alpha 0 -pme_beta 1 -pme_gamma 1 \
-smooth_flow 0.1 -smooth_second_order -const_vel 0.0001 -past_flow -convert_to_soft \
-retrain checkpoints/Hard_Constraint/model_10.t7 -optimState checkpoints/Hard_Constraint/optimState_10.t7
Fine-tuning 'Hard_Constraint' model after 10 iterations using the soft constraint network on Sintel:
th main.lua -cache checkpoints -expName Soft_Sintel -dataset Sintel -ground_truth -LR 0.00001 \
-pme 4 -pme_criterion OBGCC -pme_alpha 1 -pme_beta 0 -pme_gamma 0 \
-smooth_flow 0.1 -smooth_second_order -const_vel 0.0001 -past_flow -convert_to_soft \
-retrain checkpoints/Hard_Constraint/model_10.t7 -optimState checkpoints/Hard_Constraint/optimState_10.t7
The complete list of parameters and the default values can be found in opts.lua.
We also provide our pre-trained model on RoamingImages 'Ours-Hard' and the corresponding state of the optimizer for individual fine-tuning: [model], [state]
-retrain [PATH]/RoamingImages_H.t7 -optimState [PATH]/RoamingImages_H_optimState.t7
- Our code is based on anuragranj/spynet.
- The warping code is based on qassemoquab/stnbhwd.
- The images in
samples
are from KITTI 2015 dataset:
A. Geiger, P. Lenz, C. Stiller, R. Urtasun: "Vision meets robotics: The KITTI dataset." International Journal of Robotics Research (IJRR). (2013)
M. Menze, A. Geiger: "Object scene flow for autonomous vehicles." In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). (2015) - Some parts of
flowExtensions.lua
are adapted from marcoscoffier/optical-flow with help from fguney.
Free for non-commercial and scientific research purposes. For commercial use, please contact ps-license@tue.mpg.de.
Check LICENSE file for details.
@inproceedings{Janai2018ECCV,
title = {Unsupervised Learning of Multi-Frame Optical Flow with Occlusions },
author = {Janai, Joel and G{"u}ney, Fatma and Ranjan, Anurag and Black, Michael J. and Geiger, Andreas},
booktitle = {European Conference on Computer Vision (ECCV)},
volume = {Lecture Notes in Computer Science, vol 11220},
pages = {713--731},
publisher = {Springer, Cham},
month = sep,
year = {2018},
month_numeric = {9}
}