We present a new semi-parametric approach to synthesize novel views of an object from a single monocular image. First, we exploit man-made object symmetry and piece-wise planarity to integrate rich a-priori visual information into the novel viewpoint synthesis process. An Image Completion Network (ICN) then leverages 2.5D sketches rendered from a 3D CAD as guidance to generate a realistic image. In contrast to concurrent works, we do not rely solely on synthetic data but leverage instead existing datasets for 3D object detection to operate in a real-world scenario. Differently from competitors, our semi-parametric framework allows the handling of a wide range of 3D transformations. Thorough experimental analysis against state-of-the-art baselines shows the efficacy of our method both from a quantitative and a perceptive point of view.
Run the following in a fresh Python 3.6 environment to install all dependencies:
pip install -r requirements.txt
Code was tested on Ubuntu Linux only (16.04, 17.04).
To run the demo code, please download and unzip all the data from this shared directory into a <data_root> of your choice.
The entry point is run_rotate.py. The script expects as mandatory arguments the object class, the Pascal dataset, the pre-trained weights and the 3D models directory.
For the car class it can be run as follows:
python run_rotate.py car <data_root>/pascal_car <data_root>/car_icn.pth <data_root>/car_cads --device cpu
Replace car with chair to run on the chair class.
If everything went well, you should see a GUI like the following:
The GUI is composed of two windows: the viewport and the output one.
While the focus is on the viewport, the keyboard can be used to move around the object in spherical coordinates. The full list of commands is provided here. While you move, the output window shows both the Image Completion Network (ICN) inputs (2.5D sketches, appearance prior) and the network prediction. Please refer to Sec. 3 of the paper for details.
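As a rough illustration of what moving "in spherical coordinates" means, the sketch below converts a (radius, azimuth, elevation) camera state into a Cartesian position and applies one keyboard-style step to it. The function names, the z-up axis convention and the minimum radius are assumptions made for this example; they are not taken from run_rotate.py.

```python
import math


def spherical_to_cartesian(radius, azimuth_deg, elevation_deg):
    """Camera position on a sphere centred on the object (assumed z-up)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius * math.cos(el) * math.cos(az)
    y = radius * math.cos(el) * math.sin(az)
    z = radius * math.sin(el)
    return (x, y, z)


def step_camera(state, d_radius=0.0, d_azimuth=0.0, d_elevation=0.0):
    """Apply one step to a (radius, azimuth, elevation) state, angles in degrees."""
    radius, azimuth, elevation = state
    # Hypothetical lower bound that keeps the camera off the object centre.
    radius = max(0.1, radius + d_radius)
    # Azimuth wraps around; elevation is left free so full flips remain possible.
    azimuth = (azimuth + d_azimuth) % 360.0
    elevation = elevation + d_elevation
    return (radius, azimuth, elevation)
```

Each key press would then map to a call such as `step_camera(state, d_azimuth=5.0)`, followed by `spherical_to_cartesian(*state)` to place the camera.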
Notice: when starting the program, Open3D may not render anything. This is an initialization issue. If this happens, just focus on the viewport and press the spacebar a couple of times until you see both windows rendered properly.
Due to its semi-parametric nature, our method can handle extreme viewpoint changes.
[Animations: manipulating radius, manipulating elevation, arbitrary rototranslation, chairs backflip]
Additional examples generated synthetically using our model are shown below.
Each row is generated as follows. Given an image from Pascal3D+, other examples in the same pose are randomly sampled from the dataset. Then, our method is used to transfer the appearance of the latter to the pose of the former. Finally, the generated vehicles are stitched onto the original image. For seamless collaging, we apply a small Gaussian blur at the mask border.
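The collaging step can be sketched as feathered alpha blending: the binary object mask is blurred with a small Gaussian and then used as a per-pixel mixing weight. This is a minimal NumPy sketch under assumed names (`blur_mask`, `paste`) and an assumed `sigma`; it is not the code used to produce the figures.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    xs = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(xs ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur_mask(mask, sigma=1.0):
    """Separable Gaussian blur of a 2D mask, with edge padding."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(mask.astype(np.float64), radius, mode="edge")

    def conv(v):
        return np.convolve(v, kernel, mode="valid")

    rows = np.apply_along_axis(conv, 1, padded)  # blur along the width
    return np.apply_along_axis(conv, 0, rows)    # then along the height


def paste(background, foreground, mask, sigma=1.0):
    """Blend foreground over background using the blurred mask as alpha."""
    alpha = blur_mask(mask, sigma)[..., None]  # H x W x 1, broadcast over channels
    return alpha * foreground + (1.0 - alpha) * background
```

Here `background` and `foreground` are float arrays of shape H x W x 3 and `mask` is H x W with values in [0, 1]; a larger `sigma` gives a wider, softer transition at the border.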
Percentage of Correct Keypoints (PCK) logged in TensorBoard during training (see Sec. 4.4)
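For readers unfamiliar with the metric, PCK is commonly computed as the fraction of predicted keypoints that fall within a distance threshold of the ground truth, with the threshold set to a fraction `alpha` of the larger bounding-box side. The sketch below follows that common definition; the function name, the default `alpha` and the 2D keypoint format are assumptions, and the exact protocol used in training is the one described in Sec. 4.4 of the paper.

```python
import math


def pck(pred, gt, bbox_size, alpha=0.1):
    """Fraction of 2D keypoints predicted within alpha * max(bbox side) of the ground truth.

    pred, gt: equally long sequences of (x, y) pairs.
    bbox_size: (width, height) of the object bounding box.
    """
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    threshold = alpha * max(bbox_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)
```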
Datasets (link for download)
We release two datasets of 3D models (cars, chairs) with annotated 3D keypoints.
Currently, there are 59 annotated models for car and 73 for chair.
3D models come from ShapeNet and have been converted to .ply format (with colors).
Each example of the datasets is composed of the following components:
- One .ply file containing the 3D model mesh and colors.
- One .yaml file containing the 3D keypoints annotation.
- One .jpg image of the model thumbnail.
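A small helper like the following could be used to check that a downloaded dataset is complete. It assumes that the three files of an example share the same file name stem and sit in the same folder; that layout is a guess made for illustration, so adjust the matching rule to the actual archive. The name `collect_examples` is hypothetical.

```python
from pathlib import Path

REQUIRED_SUFFIXES = (".ply", ".yaml", ".jpg")


def collect_examples(root):
    """Map each model name to its .ply, .yaml and .jpg paths.

    Assumes (hypothetically) that the three files share one stem and folder.
    Models with a missing component are skipped.
    """
    examples = {}
    for ply_path in sorted(Path(root).rglob("*.ply")):
        parts = {suffix: ply_path.with_suffix(suffix) for suffix in REQUIRED_SUFFIXES}
        if all(path.is_file() for path in parts.values()):
            examples[ply_path.stem] = parts
    return examples
```

Comparing `len(collect_examples(...))` against the model counts stated above is a quick sanity check after unzipping.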
Annotated keypoints are the ones in Pascal3D+: 12 for cars and 10 for chairs.
Car keypoints: front wheel, back wheel, upper windshield, upper rearwindow, front light, back trunk (2x, left and right).
Chair keypoints: back upper, seat upper, seat lower, leg upper, leg lower (2x, left and right).
We believe that research should be as open as possible and we are happy if these datasets can be helpful for your research too. If you use these data, please cite our research work.
Since the 3D models come from the ShapeNet database, by using these datasets you agree to the ShapeNet terms of use.