COCO experiments code
adiprasad committed Sep 9, 2018
0 parents commit 0bd490c
Showing 23 changed files with 2,495 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "py-faster-rcnn-ft"]
path = py-faster-rcnn-ft
url = https://github.com/adiprasad/py-faster-rcnn-ft.git
96 changes: 96 additions & 0 deletions README.md
@@ -0,0 +1,96 @@
# unsup-hard-negative-mining-mscoco
This is the repository for the experiments on MSCOCO classes discussed in Section 5 (Discussion) of the paper [Unsupervised Hard Example Mining from Videos for Improved Object Detection](https://arxiv.org/abs/1808.04285).

We used the original version of [py-faster-rcnn-ft](https://github.com/DFKI-Interactive-Machine-Learning/py-faster-rcnn-ft) to fine-tune a VGG16 network pretrained on ImageNet, turning it into a binary classifier for a single MSCOCO category. With that classifier as the backbone of Faster RCNN, we labelled every frame of each video for the presence of that category. From the labelled frames, our algorithm identified the frames containing hard negatives. Finally, we fine-tuned the network again with the hard negative frames included, and evaluated the improvement on held-out validation and test sets.

For our research, we carried out experiments on two MSCOCO categories: Dog and Train.

## Steps

### 1. Preparing a Faster RCNN object detector on an MSCOCO category

Follow the steps mentioned in the [py-faster-rcnn-ft](https://github.com/DFKI-Interactive-Machine-Learning/py-faster-rcnn-ft) repository to prepare a VGG16 Faster RCNN network trained on an MSCOCO category of your choice.

### 2. Label the videos with detections

Scrape the web and download videos that are likely to contain many instances of your chosen category. Helper code to download YouTube videos can be found [here](utils/scrape-youtube/scrape_videos.py). Once the videos have been downloaded, run the detections code to label each frame of every video with bounding boxes and confidence scores for that category. See [Usage](detections_code/README.txt).

The videos we used are listed below:

1. [Dog videos](https://docs.google.com/spreadsheets/d/1q9EeOHVYXugtmR1batdDDsb5wzWnwiQc-egLDmdWk78/#gid=1264294087)
2. [Train videos](https://docs.google.com/spreadsheets/d/1q9EeOHVYXugtmR1batdDDsb5wzWnwiQc-egLDmdWk78/#gid=994319682)
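The exact txt format produced by the detection scripts is defined in `tools/test_dets_*_detector.py`; as a rough illustration of what frame-wise labelling produces, the sketch below serializes per-frame detections to text. The line layout and function name here are hypothetical, not this repo's actual format:

```python
# Illustrative only: serialize per-frame detections as text lines.
# The real output format is whatever tools/test_dets_*_detector.py writes;
# this hypothetical layout is "frame_idx score x1 y1 x2 y2", one detection per line.

def detections_to_lines(frame_detections, conf_thresh=0.6):
    """frame_detections: iterable of (frame_idx, [(score, (x1, y1, x2, y2)), ...]).

    Yields one text line per detection at or above the confidence threshold.
    """
    for frame_idx, dets in frame_detections:
        for score, (x1, y1, x2, y2) in dets:
            if score >= conf_thresh:
                yield "%d %.3f %d %d %d %d" % (frame_idx, score, x1, y1, x2, y2)
```

The 0.6 default mirrors the `MIN_SCORE` value used by the sbatch scripts in `detections_code`.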

### 3. Hard negative mining

The detections code outputs a txt file containing frame-wise labels and bounding box information. Run the hard negative mining code on that detections txt file to extract the frames containing hard negatives, together with a txt file of the bounding box information on those frames. See [Usage](hn_mining_code/README.txt).
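The "Flickers as HN" models in the results table come from treating detector flickers, i.e. detections that appear in an isolated frame but vanish in the temporally adjacent ones, as likely false positives. A minimal sketch of that selection idea, assuming each frame is reduced to a boolean "has a detection" flag (the real logic in `hn_mining_code` also reasons about the boxes themselves, not just per-frame presence):

```python
# Simplified sketch of flicker-based hard negative selection: a frame whose
# detections have no detections in any neighbouring frame is treated as a
# likely false positive ("flicker"). Illustrative only; see hn_mining_code
# for the repo's actual algorithm.

def flicker_frames(frames_with_dets, num_frames, window=1):
    """frames_with_dets: set of frame indices that contain >= 1 detection.

    Returns the frame indices whose detections are temporally isolated.
    """
    flickers = []
    for f in sorted(frames_with_dets):
        neighbours = [f + d for d in range(-window, window + 1) if d != 0]
        # keep only neighbours that actually exist in the video
        neighbours = [n for n in neighbours if 0 <= n < num_frames]
        if neighbours and all(n not in frames_with_dets for n in neighbours):
            flickers.append(f)
    return flickers
```

Widening `window` trades recall for precision: a larger window demands a longer stretch of empty neighbouring frames before a detection is called a flicker.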

### 4. Include the video frames containing hard negatives in the COCO dataset and fine-tune

Use the COCO annotations editor located inside `utils` to include the frames containing hard negatives in the MSCOCO dataset. Once the frames have been included, fine-tune again to obtain an improved network. See [Usage](utils/edit-coco-annotations/README.txt).
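Conceptually, the annotations editor appends the hard negative frames to a COCO-style JSON file as images that carry no positive annotation, so the trainer samples them as background. The sketch below follows the standard COCO JSON layout; the function and its signature are illustrative, not the repo's actual API:

```python
# Hedged sketch: add hard negative frames to a COCO-style annotation file as
# background-only images. Field names follow the standard COCO JSON layout
# ("images" / "annotations" / "categories"); the helper itself is hypothetical.
import json

def add_negative_images(coco_json_in, coco_json_out, frame_files, width, height):
    """Append each frame file as an image with no positive annotations."""
    with open(coco_json_in) as f:
        coco = json.load(f)
    ids = [img["id"] for img in coco["images"]]
    next_id = (max(ids) if ids else 0) + 1
    for fname in frame_files:
        coco["images"].append({
            "id": next_id,
            "file_name": fname,
            "width": width,
            "height": height,
        })
        next_id += 1
    # nothing is appended to coco["annotations"], so the new images act as
    # pure background (negative) examples during fine-tuning
    with open(coco_json_out, "w") as f:
        json.dump(coco, f)
```

The repo's actual editor may also record the mined boxes themselves; this sketch only covers the background-image case.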


## Results


<br>

A summary of the results is given below:

<br>
<table>
<thead>
<tr>
<th><b>Category</b></th>
<th><b>Model</b></th>
<th><b>Training Iterations</b></th>
<th><b>Training Hyperparams</b></th>
<th><b>Validation set AP</b></th>
<th><b>Test set AP</b></th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan=2>Dog</td>
<td rowspan=1>Baseline</td>
<td rowspan=1>29000</td>
<td>LR : 1e-3 for 10k,<br>1e-4 for 10k-20k,<br>1e-5 for 20k-29k</td>
<td rowspan=1>26.9</td>
<td rowspan=1>25.3</td>
</tr>
<tr>
<td rowspan=1>Flickers as HN</td>
<td rowspan=1>22000</td>
<td>LR : 1e-4 for 15k,<br>1e-5 for 15k-22k</td>
<td rowspan=1>28.1</td>
<td rowspan=1>26.4</td>
</tr>
<tr>
<td rowspan=2>Train</td>
<td rowspan=1>Baseline</td>
<td rowspan=1>26000</td>
<td>LR : 1e-3,<br>stepsize : 10k,<br>lr decay : 0.1</td>
<td rowspan=1>33.9</td>
<td rowspan=1>33.2</td>
</tr>
<tr>
<td rowspan=1>Flickers as HN</td>
<td rowspan=1>24000</td>
<td>LR : 1e-3,<br>stepsize : 10k,<br>lr decay : 0.1</td>
<td rowspan=1>35.4</td>
<td rowspan=1>33.7</td>
</tr>
</tbody>
</table>

<br>
A few examples of the reduction in false positives achieved for the 'Dog' category are shown below:
<br>
<br>

Baseline | Flickers as HN
:-------------------:|:--------------------:
![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame330_before.jpg) | ![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame330_after.jpg)
![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame1548_before.jpg) | ![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame1548_after.jpg)
![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame3156_before.jpg) | ![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame3156_after.jpg)
![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame9195_before.jpg) | ![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame9195_after.jpg)
![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame43837_before.jpg) | ![](https://people.cs.umass.edu/~aprasad/Detector_Results/dog_detector/images_iter_1/video1/hns_shown/frame43837_after.jpg)
32 changes: 32 additions & 0 deletions detections_code/README.txt
@@ -0,0 +1,32 @@
To use the detections code, place all your downloaded videos inside a parent folder, say 'downloaded_videos', with the following structure:

downloaded_videos
|
|--video1
| |--video1.mkv
|
|--video2
| |--video2.mkv
|
..
..
..

Helper code is available to convert videos lying directly inside a folder into the folder structure shown above.

Steps:

1. Specify the path of the model weights to use on line 28 of the detection script.
2. Choose an output_folder where the detection txt files will be written.
3. Choose a confidence_threshold at or above which detections will be reported.

4. Usage:

python2 tools/test_dets_dog_detector.py \
--video_folder downloaded_videos \
--video ID \
--out_folder output_folder \
--conf_thresh confidence_threshold


where ID is 1, 2, 3, etc., matching the videoN folder names inside the downloaded_videos hierarchy
26 changes: 26 additions & 0 deletions detections_code/getDogDetections.sh
@@ -0,0 +1,26 @@
#!/bin/bash
#SBATCH --job-name=detectron
#SBATCH -N1 # Ensure that all cores are on one machine
#SBATCH --partition=m40-long # Partition to submit to (serial_requeue)
#SBATCH --mem=50000 # Memory pool for all cores (see also --mem-per-cpu)
#SBATCH --output=logs/result_%A_%a.out # File to which STDOUT will be written
#SBATCH --error=logs/result_%A_%a.err # File to which STDERR will be written
#SBATCH --gres=gpu:1
#SBATCH --array=1-15
## Usage:
## sbatch getDogDetections.sh ${VIDEO_PATH} ${DATASET_NAME}
echo `pwd`
echo $1
echo "SLURM task ID: "$SLURM_ARRAY_TASK_ID
##### Experiment settings #####
VIDEO_PATH=$1/video${SLURM_ARRAY_TASK_ID} # first argument is the parent folder of the videos
OUTPUT_NAME=/mnt/nfs/scratch1/aprasad/dog_detection_outputs/${2}
MIN_SCORE=0.6
echo "Chunk path: "${VIDEO_PATH}

python2 tools/test_dets_dog_detector.py \
--video_folder $1 \
--video ${SLURM_ARRAY_TASK_ID} \
--out_folder ${OUTPUT_NAME} \
    --conf_thresh ${MIN_SCORE} # use the threshold defined above; $2 names the output folder

26 changes: 26 additions & 0 deletions detections_code/getTrainDetections.sh
@@ -0,0 +1,26 @@
#!/bin/bash
#SBATCH --job-name=detectron
#SBATCH -N1 # Ensure that all cores are on one machine
#SBATCH --partition=titanx-long # Partition to submit to (serial_requeue)
#SBATCH --mem=50000 # Memory pool for all cores (see also --mem-per-cpu)
#SBATCH --output=logs/result_%A_%a.out # File to which STDOUT will be written
#SBATCH --error=logs/result_%A_%a.err # File to which STDERR will be written
#SBATCH --gres=gpu:1
#SBATCH --array=1-26
## Usage:
## sbatch getTrainDetections.sh ${VIDEO_PATH} ${DATASET_NAME}
echo `pwd`
echo $1
echo "SLURM task ID: "$SLURM_ARRAY_TASK_ID
##### Experiment settings #####
VIDEO_PATH=$1/video${SLURM_ARRAY_TASK_ID} # first argument is the parent folder of the videos
OUTPUT_NAME=/mnt/nfs/scratch1/aprasad/TRAIN_ATTEMPT_3/detections/set4/${2}
MIN_SCORE=0.6
echo "Chunk path: "${VIDEO_PATH}

python2 tools/test_dets_train_detector.py \
--video_folder $1 \
--video ${SLURM_ARRAY_TASK_ID} \
--out_folder ${OUTPUT_NAME} \
    --conf_thresh ${MIN_SCORE} # use the threshold defined above; $2 names the output folder

@@ -0,0 +1,35 @@
import os
from os.path import join
import pickle

# Parent folder containing the downloaded .mkv files (set this as needed)
parent_dir = os.getcwd()

mapping_txt_file = open(join(parent_dir, "video_name_key_map.txt"), "a+")

video_name_cntr = 1
filename_mapping_dict = {}

files_in_chunk = os.listdir(parent_dir)
mkv_files = [f for f in files_in_chunk if f.endswith(".mkv")]

for file_name in mkv_files:
    file_path = join(parent_dir, file_name)

    # Move each video into its own numbered folder: videoN/videoN.mkv
    new_video_path = join(parent_dir, "video{0}".format(video_name_cntr))
    os.mkdir(new_video_path)
    new_file_path = join(new_video_path, "video{0}.mkv".format(video_name_cntr))
    os.rename(file_path, new_file_path)

    # Record which original file became videoN
    filename_mapping_dict[file_path] = video_name_cntr
    mapping_txt_file.write("{0}\t{1}\n".format(video_name_cntr, file_path))

    video_name_cntr += 1

mapping_txt_file.close()
with open(join(parent_dir, "filename_mapping_dict.p"), "wb") as f:
    pickle.dump(filename_mapping_dict, f)


