This is the PyTorch implementation of FOPA for the following research paper. FOPA is the first discriminative approach for the object placement task.
Fast Object Placement Assessment [arXiv]
Li Niu, Qingyang Liu, Zhenchen Liu, Jiangtong Li
FOPA has been integrated into our image composition toolbox libcom (https://github.com/bcmi/libcom). You are welcome to visit and try it \(^▽^)/
If you want to change the backbone to a transformer, you can refer to TopNet.
All the code has been tested on PyTorch 1.7.0. Follow the instructions below to run the project.
First, clone the repository:
git clone git@github.com:bcmi/FOPA-Fast-Object-Placement-Assessment.git
Then, install Anaconda and create a virtual environment:
conda create -n fopa
conda activate fopa
Install PyTorch 1.7.0 (higher versions should be fine):
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.2 -c pytorch
Install necessary packages:
pip install -r requirements.txt
Download and extract data from Baidu Cloud (access code: 4zf9) or Google Drive. Download the SOPA encoder from Baidu Cloud (access code: 1x3n) or Google Drive. Put them in "data/data". It should contain the following directories and files:
<data/data>
bg/ # background images
fg/ # foreground images
mask/ # foreground masks
train(test)_pair_new.json # json annotations
train(test)_pair_new.csv # csv files
SOPA.pth.tar # SOPA encoder
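Before training or testing, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and it assumes that `train(test)_pair_new` stands for two files each, one prefixed `train_` and one prefixed `test_`.

```python
from pathlib import Path

# Expected entries under data/data, taken from the listing above.
# Assumption: "train(test)_pair_new" means separate train_* and test_* files.
EXPECTED_DIRS = ["bg", "fg", "mask"]
EXPECTED_FILES = [
    "train_pair_new.json",
    "test_pair_new.json",
    "train_pair_new.csv",
    "test_pair_new.csv",
    "SOPA.pth.tar",
]


def find_missing(data_root):
    """Return the expected directories and files that are absent under data_root."""
    root = Path(data_root)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    missing = find_missing("data/data")
    if missing:
        print("Missing under data/data:", ", ".join(missing))
    else:
        print("data/data looks complete.")
```

Run it from the repository root so that the relative path `data/data` resolves correctly.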
Download our pretrained model from Baidu Cloud (access code: uqvb) or Google Drive, and save it as './best_weight.pth'.
Before training, modify "config.py" according to your needs. After that, run:
python train.py
To get the F1 score and balanced accuracy of a specified model, run:
python test.py --mode evaluate
The results obtained with our released model should be F1: 0.778302, bAcc: 0.838696.
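For readers unfamiliar with the two metrics, here is a minimal sketch of the standard definitions of F1 and balanced accuracy (bAcc) for binary labels. It is not taken from `test.py`; the function name and the convention that label 1 marks a reasonable placement are assumptions made for illustration.

```python
def f1_and_balanced_accuracy(labels, predictions):
    """Compute F1 and balanced accuracy for binary labels.

    Assumption: 1 denotes the positive class (a reasonable placement), 0 the negative.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate

    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    bacc = (recall + specificity) / 2
    return f1, bacc


# Toy example: 3 positives and 5 negatives, with one error in each class.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 1, 0, 0, 0, 0]
f1, bacc = f1_and_balanced_accuracy(labels, predictions)
print(f"F1: {f1:.4f}, bAcc: {bacc:.4f}")
```

Balanced accuracy averages the true positive and true negative rates, so it is less sensitive than plain accuracy to an imbalance between positive and negative samples.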
To get the heatmaps predicted by FOPA, run:
python test.py --mode heatmap
To get the optimal composite images based on the predicted heatmaps, run:
python test.py --mode composite
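Conceptually, choosing an optimal placement from a heatmap amounts to finding its highest-scoring location. The sketch below illustrates that idea on a plain nested list. It is an assumption about the general approach, not the code in `test.py`; the function name is hypothetical, and how a heatmap location maps to the foreground position (for example, its center) depends on the repository's implementation.

```python
def best_location(heatmap):
    """Return (row, col, score) of the highest-scoring entry in a 2D heatmap.

    Ties are resolved in favor of the first entry in row-major order.
    """
    best = None
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if best is None or score > best[2]:
                best = (r, c, score)
    if best is None:
        raise ValueError("heatmap is empty")
    return best


# Toy 3x3 heatmap; the peak is at row 1, column 1.
heatmap = [
    [0.10, 0.20, 0.05],
    [0.30, 0.90, 0.40],
    [0.15, 0.25, 0.35],
]
row, col, score = best_location(heatmap)
print(row, col, score)  # prints: 1 1 0.9
```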
To test multi-scale foregrounds for each foreground-background pair, first run the following command to generate 'test_data_16scales.json' in './data/data' and 'test_16scales' in './data/data/fg' and './data/data/mask':
python prepare_multi_fg_scales.py
Then, to get the heatmaps of multi-scale foregrounds for each foreground-background pair, run:
python test_multi_fg_scales.py --mode heatmap
Finally, to get the composite images with top scores for each foreground-background pair, run:
python test_multi_fg_scales.py --mode composite
We show the results reported in the paper. FOPA achieves results comparable to those of SOPA.
Method | F1 | bAcc |
---|---|---|
SOPA | 0.780 | 0.842 |
FOPA | 0.776 | 0.840 |
For each background-foreground pair in the test set, we predict 16 rationality score maps, one per foreground scale, and generate the composite images with the top 50 rationality scores. Then, we randomly sample one of the 50 generated composite images per background-foreground pair for Acc and FID evaluation, using the test scripts provided by GracoNet. The generated composite images for evaluation can be downloaded from Baidu Cloud (access code: ppft) or Google Drive. The test results of the baselines and our method are shown below:
Method | Acc | FID |
---|---|---|
TERSE | 0.679 | 46.94 |
PlaceNet | 0.683 | 36.69 |
GracoNet | 0.847 | 27.75 |
IOPRE | 0.895 | 21.59 |
FOPA | 0.932 | 19.76 |
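The selection procedure used for this evaluation (take the top 50 scores across all 16 scale-specific score maps, then sample one at random) can be sketched as follows. This is an illustrative sketch on toy data, not the code in `test_multi_fg_scales.py`; the function names, the nested-list representation of the score maps, and the 4x4 map size are all assumptions.

```python
import heapq
import random


def top_k_candidates(score_maps, k=50):
    """Return the k highest-scoring (score, scale_idx, row, col) tuples across all maps."""
    candidates = (
        (score, s, r, c)
        for s, score_map in enumerate(score_maps)
        for r, row in enumerate(score_map)
        for c, score in enumerate(row)
    )
    # nlargest returns the results sorted from highest to lowest score.
    return heapq.nlargest(k, candidates)


def sample_one(candidates, seed=None):
    """Pick one candidate uniformly at random (seeded for reproducibility)."""
    return random.Random(seed).choice(candidates)


# Toy data: 16 scales, each with a 4x4 map of random scores in [0, 1).
rng = random.Random(0)
score_maps = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(16)]

top50 = top_k_candidates(score_maps, k=50)
score, scale_idx, row, col = sample_one(top50, seed=0)
print(f"sampled scale {scale_idx} at ({row}, {col}) with score {score:.3f}")
```

Using `heapq.nlargest` avoids sorting every candidate location when only the top few are needed.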
If you find this work useful for your research, please cite our paper using the following BibTeX [arxiv]:
@article{niu2022fast,
title={Fast Object Placement Assessment},
author={Niu, Li and Liu, Qingyang and Liu, Zhenchen and Li, Jiangtong},
journal={arXiv preprint arXiv:2205.14280},
year={2022}
}