This repository was archived by the owner on Nov 21, 2023. It is now read-only.

Can we train Faster R-CNN with a custom dataset? #942

Open · wants to merge 5 commits into base: main
198 changes: 103 additions & 95 deletions README.md
@@ -1,111 +1,119 @@
# Detectron Transfer Learning with the PASCAL VOC 2007 Dataset

Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including [Mask R-CNN](https://arxiv.org/abs/1703.06870). It is written in Python and powered by the [Caffe2](https://github.com/caffe2/caffe2) deep learning framework.
**Detectron implements several object detection algorithms, all trained on the COCO 2014 dataset, which has 80 categories. I want to fine-tune Faster R-CNN with FPN on the PASCAL VOC 2007 dataset, which has only 20 categories. The same procedure can be used to fine-tune a model on your own dataset.**

At FAIR, Detectron has enabled numerous research projects, including: [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144), [Mask R-CNN](https://arxiv.org/abs/1703.06870), [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333), [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002), [Non-local Neural Networks](https://arxiv.org/abs/1711.07971), [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370), and [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).

<div align="center">
<img src="demo/output/33823288584_1d21cf0a26_k_example_output.jpg" width="700px" />
<p>Example Mask R-CNN output.</p>
</div>

## Introduction

The goal of Detectron is to provide a high-quality, high-performance
codebase for object detection *research*. It is designed to be flexible in order
to support rapid implementation and evaluation of novel research. Detectron
includes implementations of the following object detection algorithms:

- [Mask R-CNN](https://arxiv.org/abs/1703.06870) -- *Marr Prize at ICCV 2017*
- [RetinaNet](https://arxiv.org/abs/1708.02002) -- *Best Student Paper Award at ICCV 2017*
- [Faster R-CNN](https://arxiv.org/abs/1506.01497)
- [RPN](https://arxiv.org/abs/1506.01497)
- [Fast R-CNN](https://arxiv.org/abs/1504.08083)
- [R-FCN](https://arxiv.org/abs/1605.06409)

using the following backbone network architectures:

- [ResNeXt{50,101,152}](https://arxiv.org/abs/1611.05431)
- [ResNet{50,101,152}](https://arxiv.org/abs/1512.03385)
- [Feature Pyramid Networks](https://arxiv.org/abs/1612.03144) (with ResNet/ResNeXt)
- [VGG16](https://arxiv.org/abs/1409.1556)

Additional backbone architectures may be easily implemented. For more details about these models, please see [References](#references) below.

## License

Detectron is released under the [Apache 2.0 license](https://github.com/facebookresearch/detectron/blob/master/LICENSE). See the [NOTICE](https://github.com/facebookresearch/detectron/blob/master/NOTICE) file for additional details.

## Citing Detectron

If you use Detectron in your research or wish to refer to the baseline results published in the [Model Zoo](MODEL_ZOO.md), please use the following BibTeX entry.
```
@misc{Detectron2018,
  author = {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
            Piotr Doll\'{a}r and Kaiming He},
  title = {Detectron},
  howpublished = {\url{https://github.com/facebookresearch/detectron}},
  year = {2018}
}
```

## 1. Set up Caffe2 and Detectron and run the Detectron demo successfully.
I will refer to the Detectron directory as $DETECTRON.

## 2. Download the pre-trained models
The code downloads the models automatically, but my internet connection is slow and I prefer to download them before running the code. Because I am using ResNet-50 as the backbone, I need the ImageNet-pretrained ResNet-50 weights and the e2e_faster_rcnn_R-50-FPN_2x model trained on COCO. The second download goes to the exact path used as WEIGHTS in the configuration below.
```
mkdir -p /tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA
wget -O /tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl \
    https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl

mkdir -p '/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn'
wget -O '/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl' \
    'https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl'
```
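Optionally, sanity-check the downloads before training. A minimal sketch, assuming the checkpoints are ordinary Python pickles (Detectron's .pkl files generally are):
```python
# Hypothetical sanity check: confirm both downloads exist and unpickle cleanly.
import os
import pickle

paths = [
    '/tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl',
    '/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/'
    'e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/'
    'coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl',
]
for p in paths:
    with open(p, 'rb') as f:
        obj = pickle.load(f)
    print('%s: %.1f MB, top-level type %s'
          % (os.path.basename(p), os.path.getsize(p) / 1e6, type(obj).__name__))
```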

## 3. Prepare the configuration file.
### a. Copy the sample configuration file from $DETECTRON/configs/getting_started
```
cd $DETECTRON
mkdir experiments && cd experiments
cp ../configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
```
### b. Change the configuration file
```
MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
  NUM_CLASSES: 21
  FASTER_RCNN: True
```
PASCAL VOC 2007 has 20 object classes plus one background class, so NUM_CLASSES is set to 21.
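For reference, these are the 20 PASCAL VOC categories (the same ones that appear in the AP table in step 9); together with the implicit `__background__` class they give the 21 classes. A small sketch in Python; the exact ordering is an assumption (alphabetical, as in the standard VOC tools):
```python
# 20 VOC object categories plus the background class -> NUM_CLASSES: 21.
VOC_CLASSES = (
    '__background__',
    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
    'bus', 'car', 'cat', 'chair', 'cow',
    'diningtable', 'dog', 'horse', 'motorbike', 'person',
    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
)
assert len(VOC_CLASSES) == 21
```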

```
TRAIN:
  SNAPSHOT_ITERS: 5000
  WEIGHTS: /tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
  DATASETS: ('voc_2007_train',)
```
Change the WEIGHTS value to the path where you placed the Faster R-CNN model in step 2.

## 4. Download PASCAL VOC 2007 and the COCO-format annotations.
Refer to the [data readme file](https://github.com/facebookresearch/Detectron/blob/master/lib/datasets/data/README.md) to prepare the PASCAL dataset; the sketch below sets up the expected directory layout.
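The dataset_catalog.py entries below expect the images, the converted JSON annotations, and the VOC devkit under $DETECTRON/lib/datasets/data/VOC2007. Here is a minimal sketch that lays this out with symlinks; VOC_ROOT and ANN_DIR are assumptions about where you extracted VOCdevkit and downloaded the COCO-format JSON files:
```python
import os

DETECTRON = os.path.expanduser('~/Detectron')  # assumption: your $DETECTRON checkout
VOC_ROOT = '/data/VOCdevkit'                   # assumption: contains VOC2007/JPEGImages
ANN_DIR = '/data/pascal_json'                  # assumption: holds pascal_train2007.json etc.

data_dir = os.path.join(DETECTRON, 'lib/datasets/data/VOC2007')
if not os.path.isdir(data_dir):
    os.makedirs(data_dir)

# Link targets match the paths registered in dataset_catalog.py below.
links = {
    'JPEGImages': os.path.join(VOC_ROOT, 'VOC2007/JPEGImages'),
    'annotations': ANN_DIR,
    'VOCdevkit2007': VOC_ROOT,
}
for name, target in links.items():
    link = os.path.join(data_dir, name)
    if not os.path.lexists(link):
        os.symlink(target, link)
```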

The built-in PASCAL dataset support has a bug; the entries in $DETECTRON/lib/datasets/dataset_catalog.py should be changed as follows:
```
    'voc_2007_train': {
        IM_DIR:
            _DATA_DIR + '/VOC2007/JPEGImages',
        ANN_FN:
            _DATA_DIR + '/VOC2007/annotations/pascal_train2007.json',
        DEVKIT_DIR:
            _DATA_DIR + '/VOC2007/VOCdevkit2007'
    },
    'voc_2007_test': {
        IM_DIR:
            _DATA_DIR + '/VOC2007/JPEGImages',
        ANN_FN:
            _DATA_DIR + '/VOC2007/annotations/pascal_test2007.json',
        DEVKIT_DIR:
            _DATA_DIR + '/VOC2007/VOCdevkit2007'
    },

```

## 5. Rename cls_score and bbox_pred to avoid errors when train_net.py loads the weights
In lib/modeling/fast_rcnn_heads.py, change every occurrence of cls_score to cls_score_voc and every occurrence of bbox_pred to bbox_pred_voc. With new blob names, the loader skips the COCO model's 81-class classification and box-regression weights, and fresh 21-class layers are initialized instead.
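A mechanical way to apply the rename, as a minimal sketch; it rewrites the two blob-name strings in place, is safe to re-run, and assumes you back up the file first:
```python
import re

path = 'lib/modeling/fast_rcnn_heads.py'  # run from $DETECTRON
with open(path) as f:
    src = f.read()

# Rename the output blobs so the loader skips the COCO 81-class weights
# and the new 21-class layers are initialized from scratch.
src = re.sub(r'cls_score(?!_voc)', 'cls_score_voc', src)
src = re.sub(r'bbox_pred(?!_voc)', 'bbox_pred_voc', src)

with open(path, 'w') as f:
    f.write(src)
```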

## 6. Run the command to begin training.
```
python2 tools/train_net.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml OUTPUT_DIR experiments/output
```

## 7. Copy the newly trained final model.
```
mkdir -p /tmp/detectron-download-cache/voc2007/
cp experiments/output/train/voc_2007_train/generalized_rcnn/model_iter49999.pkl /tmp/detectron-download-cache/voc2007/model_final.pkl

```
## 8. Run inference on some PASCAL VOC 2007 test images
```
python2 tools/infer_simple.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml \
--output-dir /tmp/detectron-visualizations --wts /tmp/detectron-download-cache/voc2007/model_final.pkl \
demo2
```
Unfortunately, I found that every person is labeled as bird. This may be caused by JSON annotations that were not converted correctly.
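One way to check that hypothesis is to inspect the category ids in the converted JSON and compare them with the class order you expect (the same order the evaluator prints in step 9). A rough sketch, using the annotation path registered in dataset_catalog.py above:
```python
import json

ann_file = 'lib/datasets/data/VOC2007/annotations/pascal_train2007.json'
with open(ann_file) as f:
    categories = json.load(f)['categories']

# In COCO-format annotations each category has an explicit id; the detector's
# class indices are derived from this list, so the ordering must be the one
# you expect (a shifted or shuffled list would explain person -> bird).
for c in sorted(categories, key=lambda c: c['id']):
    print('%d %s' % (c['id'], c['name']))
```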

## 9. Run test_net.py on the PASCAL VOC 2007 test dataset.
```
python2 tools/test_net.py \
--cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml \
TEST.WEIGHTS /tmp/detectron-download-cache/voc2007/model_final.pkl \
NUM_GPUS 1
```
The test reports the per-class AP and the mean AP (mAP):
```
INFO voc_dataset_evaluator.py: 127: AP for aeroplane = 0.8095
INFO voc_dataset_evaluator.py: 127: AP for bicycle = 0.8042
INFO voc_dataset_evaluator.py: 127: AP for bird = 0.7086
INFO voc_dataset_evaluator.py: 127: AP for boat = 0.6418
INFO voc_dataset_evaluator.py: 127: AP for bottle = 0.6861
INFO voc_dataset_evaluator.py: 127: AP for bus = 0.8822
INFO voc_dataset_evaluator.py: 127: AP for car = 0.8794
INFO voc_dataset_evaluator.py: 127: AP for cat = 0.8621
INFO voc_dataset_evaluator.py: 127: AP for chair = 0.5876
INFO voc_dataset_evaluator.py: 127: AP for cow = 0.7799
INFO voc_dataset_evaluator.py: 127: AP for diningtable = 0.7404
INFO voc_dataset_evaluator.py: 127: AP for dog = 0.8497
INFO voc_dataset_evaluator.py: 127: AP for horse = 0.8855
INFO voc_dataset_evaluator.py: 127: AP for motorbike = 0.7912
INFO voc_dataset_evaluator.py: 127: AP for person = 0.7931
INFO voc_dataset_evaluator.py: 127: AP for pottedplant = 0.5142
INFO voc_dataset_evaluator.py: 127: AP for sheep = 0.7950
INFO voc_dataset_evaluator.py: 127: AP for sofa = 0.7457
INFO voc_dataset_evaluator.py: 127: AP for train = 0.7956
INFO voc_dataset_evaluator.py: 127: AP for tvmonitor = 0.6960
INFO voc_dataset_evaluator.py: 130: Mean AP = 0.7624

```

## References

- [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).
Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He.
Tech report, arXiv, Dec. 2017.
- [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370).
Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick.
Tech report, arXiv, Nov. 2017.
- [Non-Local Neural Networks](https://arxiv.org/abs/1711.07971).
Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
Tech report, arXiv, Nov. 2017.
- [Mask R-CNN](https://arxiv.org/abs/1703.06870).
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick.
IEEE International Conference on Computer Vision (ICCV), 2017.
- [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár.
IEEE International Conference on Computer Vision (ICCV), 2017.
- [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He.
Tech report, arXiv, June 2017.
- [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333).
Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He.
Tech report, arXiv, Apr. 2017.
- [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144).
Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- [R-FCN: Object Detection via Region-based Fully Convolutional Networks](http://arxiv.org/abs/1605.06409).
Jifeng Dai, Yi Li, Kaiming He, and Jian Sun.
Conference on Neural Information Processing Systems (NIPS), 2016.
- [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](http://arxiv.org/abs/1506.01497).
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.
Conference on Neural Information Processing Systems (NIPS), 2015.
- [Fast R-CNN](http://arxiv.org/abs/1504.08083).
Ross Girshick.
IEEE International Conference on Computer Vision (ICCV), 2015.

Binary file added demo2/000012.jpg
Binary file added demo2/000014.jpg
Binary file added demo2/000017.jpg
Binary file added demo2/000018.jpg
Binary file added demo2/000019.jpg
Binary file added demo2/000021.jpg
Binary file added demo2/000022.jpg
Binary file added demo2/000023.jpg
Binary file added demo2/000025.jpg
Binary file added demo2/000028.jpg
Binary file added demo2/000030.jpg
Binary file added demo2/000031.jpg
Binary file added demo2/000032.jpg
Binary file added demo2/000034.jpg
Binary file added demo2/000035.jpg
Binary file added demo2/000036.jpg
20 changes: 20 additions & 0 deletions experiments/_init_paths.py
@@ -0,0 +1,20 @@
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Tue Feb 6 08:39:36 2018

@author: roy
"""

"""Insert /home/roy/projects/caffe2/build to PYTHONPATH"""

import sys

pt = '/home/roy/projects/caffe2/build'


def add_path(path):
    if path not in sys.path:
        sys.path.insert(0, path)

add_path(pt)
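Presumably this helper is imported before any Caffe2/Detectron imports in scripts run from the experiments directory; a minimal usage sketch (the workspace check is an illustration, not part of the pull request):
```python
import _init_paths  # noqa: F401 -- puts the local caffe2 build on sys.path
from caffe2.python import workspace

print('CUDA devices visible to Caffe2: %d' % workspace.NumCudaDevices())
```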
9 changes: 9 additions & 0 deletions experiments/demo.txt
@@ -0,0 +1,9 @@
python2 tools/infer_simple.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir /tmp/detectron-visualizations --image-ext jpg --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl demo

python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl demo

python2 tools/train_net.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml OUTPUT_DIR experiments/output
55 changes: 55 additions & 0 deletions experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
@@ -0,0 +1,55 @@
MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
  NUM_CLASSES: 21
  FASTER_RCNN: True
NUM_GPUS: 1
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.0025
  GAMMA: 0.1
  MAX_ITER: 50000
  STEPS: [0, 30000, 40000]
  # Equivalent schedules with...
  # 1 GPU:
  #   BASE_LR: 0.0025
  #   MAX_ITER: 60000
  #   STEPS: [0, 30000, 40000]
  # 2 GPUs:
  #   BASE_LR: 0.005
  #   MAX_ITER: 30000
  #   STEPS: [0, 15000, 20000]
  # 4 GPUs:
  #   BASE_LR: 0.01
  #   MAX_ITER: 15000
  #   STEPS: [0, 7500, 10000]
  # 8 GPUs:
  #   BASE_LR: 0.02
  #   MAX_ITER: 7500
  #   STEPS: [0, 3750, 5000]
FPN:
  FPN_ON: True
  MULTILEVEL_ROIS: True
  MULTILEVEL_RPN: True
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.add_roi_2mlp_head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
TRAIN:
  SNAPSHOT_ITERS: 5000
  WEIGHTS: /tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
  DATASETS: ('voc_2007_train',)
  SCALES: (500,)
  MAX_SIZE: 833
  BATCH_SIZE_PER_IM: 256
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
TEST:
  DATASETS: ('voc_2007_test',)
  SCALES: (500,)
  MAX_SIZE: 833
  NMS: 0.5
  RPN_PRE_NMS_TOP_N: 1000  # Per FPN level
  RPN_POST_NMS_TOP_N: 1000
OUTPUT_DIR: .