Project for the course "Computer Vision" of the University of Bologna, A.Y. 2021/2022.
The repository contains 4 submodules:
- FIRe, the official model implementation presented at ICLR 2022 [paper]
- HOW, the official local descriptors implementation presented at ECCV 2020 [paper]
- ASMK, the official evaluation framework implementation presented at ICCV 2013 [paper]
- cnnimageretrieval-pytorch, a toolbox that implements training and testing for the approaches presented at ECCV 2016 [paper-1] and published in the IEEE Transactions on Pattern Analysis and Machine Intelligence journal [paper-2]
To clone it correctly, run the following command in the terminal:
git clone --recurse-submodules https://github.com/prushh/image-retrieval-fire
Note that the last submodule clones a specific branch called tested_with_fire, created specifically to refer to the latest release, the one tested in the FIRe project.
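If the repository was already cloned without the --recurse-submodules flag, the submodules can still be fetched afterwards:
# Initialize and fetch all submodules in an existing clone
git submodule update --init --recursive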
To run the project, you will need to have Python installed. For best performance, it is recommended to have CUDA installed on the system and a GPU with at least 10 GB of VRAM.
To install the project's dependencies, create and activate a virtual environment inside the project folder, then install the packages listed in the requirements.txt file:
# Create and activate the virtual environment
python3 -m virtualenv venv
source venv/bin/activate
# Installing packages
pip3 install --no-cache-dir -r requirements.txt
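After installing the dependencies, you can check that PyTorch detects the GPU mentioned above; a minimal check (relying only on the standard PyTorch CUDA APIs) is:
# Print whether CUDA is available and the total memory of the first GPU in GB
python3 -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_properties(0).total_memory / 1e9 if torch.cuda.is_available() else 'no GPU detected')"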
It is also necessary to set the PYTHONPATH environment variable; you can use the following commands:
export PYTHONPATH=${PYTHONPATH}:/absolute/path/to/HOW
export PYTHONPATH=${PYTHONPATH}:/absolute/path/to/ASMK
export PYTHONPATH=${PYTHONPATH}:/absolute/path/to/cnnimageretrieval-pytorch
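To quickly verify that the three packages are reachable, an import check can be run (the top-level module names how, asmk and cirtorch are assumptions about the respective submodules):
# Should exit without ImportError if PYTHONPATH is set correctly
python3 -c "import how, asmk, cirtorch"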
Our runs were performed on UniBO's HPC cluster, which provides the Slurm Workload Manager for job submission. Note that all the necessary datasets are automatically downloaded during the first execution, and they take up several GB. Training and evaluation are launched as follows:
python3 fire/train.py fire/train_fire.yml -e <train_experiment_folder>
python3 fire/evaluate.py fire/eval_fire.yml -e <eval_experiment_folder> -ml <train_experiment_folder>
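On a Slurm-managed cluster such as the one mentioned above, the training command can be wrapped in a batch script; the following is only a minimal sketch, where job name, partition, resource requests and paths are placeholders rather than the exact configuration we used:
#!/bin/bash
#SBATCH --job-name=fire-train
#SBATCH --partition=gpu            # replace with the partition available on your cluster
#SBATCH --gres=gpu:1               # request a single GPU
#SBATCH --mem=32G
#SBATCH --time=24:00:00
#SBATCH --output=fire-train-%j.out

source venv/bin/activate
export PYTHONPATH=${PYTHONPATH}:/absolute/path/to/HOW:/absolute/path/to/ASMK:/absolute/path/to/cnnimageretrieval-pytorch

python3 fire/train.py fire/train_fire.yml -e <train_experiment_folder>
The script can then be submitted with sbatch <script_name>.sbatch.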
For more details, see the README.md of the FIRe repository.
Prepare a detailed seminar on a "recent" scientific paper related to the topics of the course. This presentation should also include a detailed description/analysis of the source code associated with the scientific paper and the results achieved in it:
- If the latter is not available, the student should try to re-implement the corresponding model, replicating at least one experiment from the article
- You can also try simple modifications to the original solution proposed in the paper
Reproducing the original training setup proposed in the paper was not possible due to CUDA out-of-memory errors. By default, the model samples training tuples composed of one query image, one positive image and 5 hard negatives. Starting from the pretrained model, we therefore tried the following two configurations:
- Freeze the CNN backbone and train only the LIT module (see the sketch after this list)
- Reduce the number of hard negative images inside the tuple from 5 to 3

By studying Appendix A.2 of the FIRe paper, we gained insights into the impact of hard negatives on performance, and in particular that reducing the number of hard negatives per training tuple does not excessively reduce it.
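As a rough illustration of the first configuration, freezing a CNN backbone while keeping another module trainable can be expressed in PyTorch as below; this is only a sketch with a dummy model, since the attribute names backbone and lit are assumptions and FIRe's actual model and training loop differ:
import torch
import torch.nn as nn

class DummyFIReLike(nn.Module):
    # Hypothetical stand-in: a CNN backbone followed by a LIT-style head
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.lit = nn.Linear(16, 16)

model = DummyFIReLike()

# Configuration 1: freeze the CNN backbone so only the LIT module is updated
for param in model.backbone.parameters():
    param.requires_grad = False

# Hand only the still-trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5)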
The code is distributed under the MIT License. See LICENSE for more information. It is based on code from FIRe, HOW, cirtorch and ASMK, which are released under their own licenses.