Saliency prediction is an active topic in computer vision with many applications. It consists of predicting where a human's attention will fall in an image or video. Our work builds on SalGAN (Paper), a deep neural network trained on a static dataset that runs inference on each image independently, without taking any other image or channel into account. Starting from this network, we have been trying to improve the saliency metrics with techniques such as depth estimation and optical flow (implemented but not yet trained), among others.
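As a rough illustration of what "taking another channel into account" can mean, the sketch below stacks an RGB frame with an optional depth map and optional coordinate channels into a single input array. This is an assumption-laden sketch, not the repository's code: the function name `stack_input_channels`, the frame size, and the CoordConv-style interpretation of coordinate channels are all hypothetical.

```python
import numpy as np

def stack_input_channels(rgb, depth=None, add_coords=False):
    """Build an (H, W, C) input from an RGB frame plus optional extra channels."""
    h, w, _ = rgb.shape
    channels = [rgb.astype(np.float32)]
    if depth is not None:
        # Assumed: a single-channel depth estimate with the same H x W as the frame.
        channels.append(depth.astype(np.float32)[..., None])
    if add_coords:
        # Assumed CoordConv-style channels: x and y positions scaled to [-1, 1].
        ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
        channels.append(np.stack([xs, ys], axis=-1).astype(np.float32))
    return np.concatenate(channels, axis=-1)

# Toy data standing in for a video frame and its estimated depth map.
frame = np.zeros((192, 256, 3), dtype=np.uint8)
depth_map = np.ones((192, 256), dtype=np.float32)
x = stack_input_channels(frame, depth=depth_map, add_coords=True)
print(x.shape)  # (192, 256, 6): 3 RGB + 1 depth + 2 coordinate channels
```

A network consuming such an input would need its first convolution widened to accept the extra channels.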
To run this model, pull the image from this link or build it from the Dockerfile provided in this repo. Also clone this repo and keep the same folder structure.
After that, place all the desired datasets in the DATASETS folder, since it will be mounted in your Docker container.
To start your container, type `make run`, and then, to attach to a bash shell, run `make devel`. When you want to leave and stop the container, press Ctrl+d and then run `make down`.
Since we have been working mainly with DHF1K, all the examples below use this dataset.
To run an experiment, go to `src/salgan_dhf1k` and run `python train_bce.py --path_out=name_of_trained_model --depth=True --daugm=True --coord=True`. Alternatively, you can run a set of experiments with the provided Makefile; just edit the file with the chosen parameters.
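If you prefer to enumerate a set of experiments from Python, a minimal sketch could look like the following. It only builds and prints the command lines, it does not run them. The helper `build_commands` and the `salgan_...` naming scheme are hypothetical, and the sketch assumes that `train_bce.py` accepts `=False` for its boolean flags, which you should check against the script itself.

```python
import itertools
import shlex

def build_commands(flag_grid, script="train_bce.py"):
    """Return one command line per combination of flag values."""
    names = sorted(flag_grid)
    commands = []
    for values in itertools.product(*(flag_grid[n] for n in names)):
        # Encode the enabled flags in the output name so runs are easy to tell apart.
        enabled = [n for n, v in zip(names, values) if v]
        path_out = "salgan_" + ("_".join(enabled) if enabled else "baseline")
        args = ["python", script, f"--path_out={path_out}"]
        args += [f"--{n}={v}" for n, v in zip(names, values)]
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands

# Hypothetical grid over two of the flags shown above.
grid = {"depth": [True, False], "coord": [True, False]}
for cmd in build_commands(grid):
    print(cmd)
```

Each printed line can be pasted into the container shell or copied into the Makefile.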
To check how the model is performing during training, you can use TensorBoard. From inside the container, in `/home/code`, run `tensorboard --logdir=trained_models`. Then, on your host, open localhost:6006 to check the loss functions and metrics.
To evaluate a model, run the script `src/evaluation/eval_dhf1k.py` and select the desired parameters as well. Available parameters: `--model` (the name that you passed to `--path_out`), `--depth`, `--coord`, and `--save`, an option to save the predicted images. As a result, you'll get the AUC, AUCs, NSS, CC and SIM of every video, as well as the overall average.
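For readers unfamiliar with these metrics, the sketch below shows commonly used definitions of NSS, CC and SIM in NumPy. It is not taken from `eval_dhf1k.py`, whose implementation may differ in normalization details; the AUC variants are omitted, and the function names and toy maps are illustrative only. The maps are assumed to be non-negative 2D arrays.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero on constant or empty maps

def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels."""
    s = (saliency - saliency.mean()) / (saliency.std() + EPS)
    return float(s[fixations.astype(bool)].mean())

def cc(saliency, density):
    """Pearson linear correlation coefficient between two maps."""
    a = saliency - saliency.mean()
    b = density - density.mean()
    return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + EPS))

def sim(saliency, density):
    """Similarity: sum of pixel-wise minima after scaling each map to sum to 1."""
    p = saliency / (saliency.sum() + EPS)
    q = density / (density.sum() + EPS)
    return float(np.minimum(p, q).sum())

# Toy example: a prediction with a single peak and one fixation at that peak.
pred = np.zeros((36, 64))
pred[10, 20] = 1.0
fix = np.zeros((36, 64))
fix[10, 20] = 1

per_frame = {"NSS": nss(pred, fix), "CC": cc(pred, pred), "SIM": sim(pred, pred)}
print(per_frame)
```

Per-video scores are typically obtained by averaging the per-frame values, and the overall score by averaging across videos.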
These experiments were run on a GeForce GTX 1080 with 12 GB of RAM.