This repository was archived by the owner on Nov 21, 2023. It is now read-only.

Can we train Faster R-CNN with a custom dataset? #942

Open · wants to merge 5 commits into base: main
198 changes: 103 additions & 95 deletions README.md
@@ -1,111 +1,119 @@
# Detectron Transfer Learning with the PASCAL VOC 2007 Dataset

Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including [Mask R-CNN](https://arxiv.org/abs/1703.06870). It is written in Python and powered by the [Caffe2](https://github.com/caffe2/caffe2) deep learning framework.
**Detectron implements several object detection algorithms, all trained on the COCO 2014 dataset, which has 80 categories. I want to fine-tune Faster R-CNN with FPN on the PASCAL VOC 2007 dataset, which has only 20 categories. The same procedure can be used to fine-tune a model on your own dataset.**

At FAIR, Detectron has enabled numerous research projects, including: [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144), [Mask R-CNN](https://arxiv.org/abs/1703.06870), [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333), [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002), [Non-local Neural Networks](https://arxiv.org/abs/1711.07971), [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370), and [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).

<div align="center">
<img src="demo/output/33823288584_1d21cf0a26_k_example_output.jpg" width="700px" />
<p>Example Mask R-CNN output.</p>
</div>

## Introduction

The goal of Detectron is to provide a high-quality, high-performance
codebase for object detection *research*. It is designed to be flexible in order
to support rapid implementation and evaluation of novel research. Detectron
includes implementations of the following object detection algorithms:

- [Mask R-CNN](https://arxiv.org/abs/1703.06870) -- *Marr Prize at ICCV 2017*
- [RetinaNet](https://arxiv.org/abs/1708.02002) -- *Best Student Paper Award at ICCV 2017*
- [Faster R-CNN](https://arxiv.org/abs/1506.01497)
- [RPN](https://arxiv.org/abs/1506.01497)
- [Fast R-CNN](https://arxiv.org/abs/1504.08083)
- [R-FCN](https://arxiv.org/abs/1605.06409)

using the following backbone network architectures:

- [ResNeXt{50,101,152}](https://arxiv.org/abs/1611.05431)
- [ResNet{50,101,152}](https://arxiv.org/abs/1512.03385)
- [Feature Pyramid Networks](https://arxiv.org/abs/1612.03144) (with ResNet/ResNeXt)
- [VGG16](https://arxiv.org/abs/1409.1556)

Additional backbone architectures may be easily implemented. For more details about these models, please see [References](#references) below.

## License

Detectron is released under the [Apache 2.0 license](https://github.com/facebookresearch/detectron/blob/master/LICENSE). See the [NOTICE](https://github.com/facebookresearch/detectron/blob/master/NOTICE) file for additional details.

## Citing Detectron

If you use Detectron in your research or wish to refer to the baseline results published in the [Model Zoo](MODEL_ZOO.md), please use the following BibTeX entry.
```
@misc{Detectron2018,
  author = {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
            Piotr Doll\'{a}r and Kaiming He},
  title = {Detectron},
  howpublished = {\url{https://github.com/facebookresearch/detectron}},
  year = {2018}
}
```

## 1. Set up Caffe2 and Detectron and run the Detectron demo successfully.
I will refer to the Detectron directory as $DETECTRON.

## 2. Download the pre-trained models
The code downloads the models automatically, but my internet connection is slow and I prefer to download them before running the code. Because I am using ResNet-50 as the backbone, I need the ImageNet-pretrained ResNet-50 weights and the e2e_faster_rcnn_R-50-FPN_2x model trained on COCO. The second download goes to the exact path used as WEIGHTS in the configuration below.
```
mkdir -p /tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA
wget -O /tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl \
    https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl

mkdir -p '/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn'
wget -O '/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl' \
    'https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl'
```
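Optionally, sanity-check the downloads before training. A minimal sketch, assuming the checkpoints are ordinary Python pickles (Detectron's .pkl files generally are):
```python
# Hypothetical sanity check: confirm both downloads exist and unpickle cleanly.
import os
import pickle

paths = [
    '/tmp/detectron/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl',
    '/tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/'
    'e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/'
    'coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl',
]
for p in paths:
    with open(p, 'rb') as f:
        obj = pickle.load(f)
    print('%s: %.1f MB, top-level type %s'
          % (os.path.basename(p), os.path.getsize(p) / 1e6, type(obj).__name__))
```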

## 3. Prepare the configuration file.
### a. Copy the sample configuration file from $DETECTRON/configs/getting_started
```
cd $DETECTRON
mkdir experiments && cd experiments
cp ../configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
```
### b. Change the configuration file
```
MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
  NUM_CLASSES: 21
  FASTER_RCNN: True
```
PASCAL VOC 2007 has 20 object classes plus one background class, so NUM_CLASSES is set to 21.
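For reference, these are the 20 PASCAL VOC categories (the same ones that appear in the AP table in step 9); together with the implicit `__background__` class they give the 21 classes. A small sketch in Python; the exact ordering is an assumption (alphabetical, as in the standard VOC tools):
```python
# 20 VOC object categories plus the background class -> NUM_CLASSES: 21.
VOC_CLASSES = (
    '__background__',
    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
    'bus', 'car', 'cat', 'chair', 'cow',
    'diningtable', 'dog', 'horse', 'motorbike', 'person',
    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
)
assert len(VOC_CLASSES) == 21
```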

```
TRAIN:
  SNAPSHOT_ITERS: 5000
  WEIGHTS: /tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
  DATASETS: ('voc_2007_train',)
```
Change the WEIGHTS value to the path where you placed the Faster R-CNN model in step 2.

## 4. Download PASCAL VOC 2007 and the COCO-format annotations.
Refer to the [data readme file](https://github.com/facebookresearch/Detectron/blob/master/lib/datasets/data/README.md) to prepare the PASCAL dataset; the sketch below sets up the expected directory layout.
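The dataset_catalog.py entries below expect the images, the converted JSON annotations, and the VOC devkit under $DETECTRON/lib/datasets/data/VOC2007. Here is a minimal sketch that lays this out with symlinks; VOC_ROOT and ANN_DIR are assumptions about where you extracted VOCdevkit and downloaded the COCO-format JSON files:
```python
import os

DETECTRON = os.path.expanduser('~/Detectron')  # assumption: your $DETECTRON checkout
VOC_ROOT = '/data/VOCdevkit'                   # assumption: contains VOC2007/JPEGImages
ANN_DIR = '/data/pascal_json'                  # assumption: holds pascal_train2007.json etc.

data_dir = os.path.join(DETECTRON, 'lib/datasets/data/VOC2007')
if not os.path.isdir(data_dir):
    os.makedirs(data_dir)

# Link targets match the paths registered in dataset_catalog.py below.
links = {
    'JPEGImages': os.path.join(VOC_ROOT, 'VOC2007/JPEGImages'),
    'annotations': ANN_DIR,
    'VOCdevkit2007': VOC_ROOT,
}
for name, target in links.items():
    link = os.path.join(data_dir, name)
    if not os.path.lexists(link):
        os.symlink(target, link)
```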

The built-in PASCAL dataset support has a bug; the entries in $DETECTRON/lib/datasets/dataset_catalog.py should be changed as follows:
```
    'voc_2007_train': {
        IM_DIR:
            _DATA_DIR + '/VOC2007/JPEGImages',
        ANN_FN:
            _DATA_DIR + '/VOC2007/annotations/pascal_train2007.json',
        DEVKIT_DIR:
            _DATA_DIR + '/VOC2007/VOCdevkit2007'
    },
    'voc_2007_test': {
        IM_DIR:
            _DATA_DIR + '/VOC2007/JPEGImages',
        ANN_FN:
            _DATA_DIR + '/VOC2007/annotations/pascal_test2007.json',
        DEVKIT_DIR:
            _DATA_DIR + '/VOC2007/VOCdevkit2007'
    },

```

## 5. Rename cls_score and bbox_pred to avoid errors when train_net.py loads the weights
In lib/modeling/fast_rcnn_heads.py, change every occurrence of cls_score to cls_score_voc and every occurrence of bbox_pred to bbox_pred_voc. With new blob names, the loader skips the COCO model's 81-class classification and box-regression weights, and fresh 21-class layers are initialized instead.
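A mechanical way to apply the rename, as a minimal sketch; it rewrites the two blob-name strings in place, is safe to re-run, and assumes you back up the file first:
```python
import re

path = 'lib/modeling/fast_rcnn_heads.py'  # run from $DETECTRON
with open(path) as f:
    src = f.read()

# Rename the output blobs so the loader skips the COCO 81-class weights
# and the new 21-class layers are initialized from scratch.
src = re.sub(r'cls_score(?!_voc)', 'cls_score_voc', src)
src = re.sub(r'bbox_pred(?!_voc)', 'bbox_pred_voc', src)

with open(path, 'w') as f:
    f.write(src)
```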

## 6. Run the command to begin training.
```
python2 tools/train_net.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml OUTPUT_DIR experiments/output
```

## 7. Copy the newly trained final model.
```
mkdir -p /tmp/detectron-download-cache/voc2007/
cp experiments/output/train/voc_2007_train/generalized_rcnn/model_iter49999.pkl /tmp/detectron-download-cache/voc2007/model_final.pkl

```
## 8. Run inference on some PASCAL VOC 2007 test images
```
python2 tools/infer_simple.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml \
--output-dir /tmp/detectron-visualizations --wts /tmp/detectron-download-cache/voc2007/model_final.pkl \
demo2
```
Unfortunately, I found that every person is labeled as bird. This may be caused by JSON annotations that were not converted correctly.
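One way to check that hypothesis is to inspect the category ids in the converted JSON and compare them with the class order you expect (the same order the evaluator prints in step 9). A rough sketch, using the annotation path registered in dataset_catalog.py above:
```python
import json

ann_file = 'lib/datasets/data/VOC2007/annotations/pascal_train2007.json'
with open(ann_file) as f:
    categories = json.load(f)['categories']

# In COCO-format annotations each category has an explicit id; the detector's
# class indices are derived from this list, so the ordering must be the one
# you expect (a shifted or shuffled list would explain person -> bird).
for c in sorted(categories, key=lambda c: c['id']):
    print('%d %s' % (c['id'], c['name']))
```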

## 9. Run test_net.py on the PASCAL VOC 2007 test dataset.
```
python2 tools/test_net.py \
--cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml \
TEST.WEIGHTS /tmp/detectron-download-cache/voc2007/model_final.pkl \
NUM_GPUS 1
```
The test reports the per-class AP and the mean AP (mAP):
```
INFO voc_dataset_evaluator.py: 127: AP for aeroplane = 0.8095
INFO voc_dataset_evaluator.py: 127: AP for bicycle = 0.8042
INFO voc_dataset_evaluator.py: 127: AP for bird = 0.7086
INFO voc_dataset_evaluator.py: 127: AP for boat = 0.6418
INFO voc_dataset_evaluator.py: 127: AP for bottle = 0.6861
INFO voc_dataset_evaluator.py: 127: AP for bus = 0.8822
INFO voc_dataset_evaluator.py: 127: AP for car = 0.8794
INFO voc_dataset_evaluator.py: 127: AP for cat = 0.8621
INFO voc_dataset_evaluator.py: 127: AP for chair = 0.5876
INFO voc_dataset_evaluator.py: 127: AP for cow = 0.7799
INFO voc_dataset_evaluator.py: 127: AP for diningtable = 0.7404
INFO voc_dataset_evaluator.py: 127: AP for dog = 0.8497
INFO voc_dataset_evaluator.py: 127: AP for horse = 0.8855
INFO voc_dataset_evaluator.py: 127: AP for motorbike = 0.7912
INFO voc_dataset_evaluator.py: 127: AP for person = 0.7931
INFO voc_dataset_evaluator.py: 127: AP for pottedplant = 0.5142
INFO voc_dataset_evaluator.py: 127: AP for sheep = 0.7950
INFO voc_dataset_evaluator.py: 127: AP for sofa = 0.7457
INFO voc_dataset_evaluator.py: 127: AP for train = 0.7956
INFO voc_dataset_evaluator.py: 127: AP for tvmonitor = 0.6960
INFO voc_dataset_evaluator.py: 130: Mean AP = 0.7624

```

## References

- [Data Distillation: Towards Omni-Supervised Learning](https://arxiv.org/abs/1712.04440).
Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He.
Tech report, arXiv, Dec. 2017.
- [Learning to Segment Every Thing](https://arxiv.org/abs/1711.10370).
Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick.
Tech report, arXiv, Nov. 2017.
- [Non-Local Neural Networks](https://arxiv.org/abs/1711.07971).
Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He.
Tech report, arXiv, Nov. 2017.
- [Mask R-CNN](https://arxiv.org/abs/1703.06870).
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick.
IEEE International Conference on Computer Vision (ICCV), 2017.
- [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002).
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár.
IEEE International Conference on Computer Vision (ICCV), 2017.
- [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He.
Tech report, arXiv, June 2017.
- [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/abs/1704.07333).
Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He.
Tech report, arXiv, Apr. 2017.
- [Feature Pyramid Networks for Object Detection](https://arxiv.org/abs/1612.03144).
Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- [R-FCN: Object Detection via Region-based Fully Convolutional Networks](http://arxiv.org/abs/1605.06409).
Jifeng Dai, Yi Li, Kaiming He, and Jian Sun.
Conference on Neural Information Processing Systems (NIPS), 2016.
- [Deep Residual Learning for Image Recognition](http://arxiv.org/abs/1512.03385).
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](http://arxiv.org/abs/1506.01497).
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.
Conference on Neural Information Processing Systems (NIPS), 2015.
- [Fast R-CNN](http://arxiv.org/abs/1504.08083).
Ross Girshick.
IEEE International Conference on Computer Vision (ICCV), 2015.

Binary file added demo2/000012.jpg
Binary file added demo2/000014.jpg
Binary file added demo2/000017.jpg
Binary file added demo2/000018.jpg
Binary file added demo2/000019.jpg
Binary file added demo2/000021.jpg
Binary file added demo2/000022.jpg
Binary file added demo2/000023.jpg
Binary file added demo2/000025.jpg
Binary file added demo2/000028.jpg
Binary file added demo2/000030.jpg
Binary file added demo2/000031.jpg
Binary file added demo2/000032.jpg
Binary file added demo2/000034.jpg
Binary file added demo2/000035.jpg
Binary file added demo2/000036.jpg
20 changes: 20 additions & 0 deletions experiments/_init_paths.py
@@ -0,0 +1,20 @@
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Tue Feb 6 08:39:36 2018

@author: roy
"""

"""Insert /home/roy/projects/caffe2/build to PYTHONPATH"""

import sys

pt = '/home/roy/projects/caffe2/build'


def add_path(path):
    if path not in sys.path:
        sys.path.insert(0, path)

add_path(pt)
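Presumably this helper is imported before any Caffe2/Detectron imports in scripts run from the experiments directory; a minimal usage sketch (the workspace check is an illustration, not part of the pull request):
```python
import _init_paths  # noqa: F401 -- puts the local caffe2 build on sys.path
from caffe2.python import workspace

print('CUDA devices visible to Caffe2: %d' % workspace.NumCudaDevices())
```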
9 changes: 9 additions & 0 deletions experiments/demo.txt
@@ -0,0 +1,9 @@
python2 tools/infer_simple.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir /tmp/detectron-visualizations --image-ext jpg --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl demo

python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl demo

python2 tools/train_net.py --cfg experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml OUTPUT_DIR experiments/output
55 changes: 55 additions & 0 deletions experiments/e2e_faster_rcnn_resnet-50-FPN_pascal2007.yaml
@@ -0,0 +1,55 @@
MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
  NUM_CLASSES: 21
  FASTER_RCNN: True
NUM_GPUS: 1
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.0025
  GAMMA: 0.1
  MAX_ITER: 50000
  STEPS: [0, 30000, 40000]
  # Equivalent schedules with...
  # 1 GPU:
  #   BASE_LR: 0.0025
  #   MAX_ITER: 60000
  #   STEPS: [0, 30000, 40000]
  # 2 GPUs:
  #   BASE_LR: 0.005
  #   MAX_ITER: 30000
  #   STEPS: [0, 15000, 20000]
  # 4 GPUs:
  #   BASE_LR: 0.01
  #   MAX_ITER: 15000
  #   STEPS: [0, 7500, 10000]
  # 8 GPUs:
  #   BASE_LR: 0.02
  #   MAX_ITER: 7500
  #   STEPS: [0, 3750, 5000]
FPN:
  FPN_ON: True
  MULTILEVEL_ROIS: True
  MULTILEVEL_RPN: True
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.add_roi_2mlp_head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
TRAIN:
  SNAPSHOT_ITERS: 5000
  WEIGHTS: /tmp/detectron/detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
  DATASETS: ('voc_2007_train',)
  SCALES: (500,)
  MAX_SIZE: 833
  BATCH_SIZE_PER_IM: 256
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
TEST:
  DATASETS: ('voc_2007_test',)
  SCALES: (500,)
  MAX_SIZE: 833
  NMS: 0.5
  RPN_PRE_NMS_TOP_N: 1000  # Per FPN level
  RPN_POST_NMS_TOP_N: 1000
OUTPUT_DIR: .