This repository has been archived by the owner on Jun 15, 2022. It is now read-only.

initial commit
hirotomusiker committed Dec 5, 2018
0 parents commit 386c5d2
Showing 22 changed files with 1,705 additions and 0 deletions.
47 changes: 47 additions & 0 deletions LICENSE
@@ -0,0 +1,47 @@
Copyright (c) 2018 DeNA Co., Ltd.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, and/or sublicense
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software; and

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


This software uses some portions from the following software under its license:

chainercv

The MIT License

Copyright (c) 2017 Preferred Networks, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

166 changes: 166 additions & 0 deletions README.md
@@ -0,0 +1,166 @@
# YOLOv3 in Pytorch
Pytorch implementation of YOLOv3

<p align="left"><img src="data/innsbruck_result.png" height="160"/> <img src="data/mountain_result.png" height="160"/></p>

## What's New
- 18/11/27 [COCO AP results of darknet (training) are reproduced with the same training conditions](#performance)
- 18/11/20 verified inference COCO AP[IoU=0.50:0.95] = 0.302 (paper: 0.310), val5k, 416x416
- 18/11/20 verified inference COCO AP[IoU=0.50] = 0.544 (paper: 0.553), val5k, 416x416

## Performance

<table><tbody>
<tr><th align="left" bgcolor=#f8f8f8> </th> <td bgcolor=white> Original (darknet) </td><td bgcolor=white> Ours (pytorch) </td></tr>
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50:0.95], inference</th> <td bgcolor=white> 0.310 </td><td bgcolor=white> 0.302 </td></tr>
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50], inference</th> <td bgcolor=white> 0.553 </td><td bgcolor=white> 0.544 </td></tr>
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50:0.95], training</th> <td bgcolor=white> 0.310 </td><td bgcolor=white> to be updated</td></tr>
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50], training</th> <td bgcolor=white> 0.553 </td><td bgcolor=white> to be updated </td></tr>
</tbody></table>

We have verified that the COCO val results of darknet are reproduced when random resizing is the only augmentation used:
<p align="left"><img src="data/val_comparison.png" height="280"/></p>
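
Random resizing means the network input size changes during training. The snippet below is a minimal sketch of one way to sample such a size. The 320 to 608 range and the stride of 32 are assumptions borrowed from darknet's multi-scale training convention, and `sample_train_size` is a hypothetical helper, not a function from this repository.

```python
import random


def sample_train_size(rng, min_size=320, max_size=608, stride=32):
    """Pick a square input size that is a multiple of `stride`.

    The default range and stride are assumptions, not values read
    from this repository's training code.
    """
    n_choices = (max_size - min_size) // stride + 1
    return min_size + stride * rng.randrange(n_choices)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sizes = [sample_train_size(rng) for _ in range(5)]
print(sizes)
```

Keeping every size a multiple of 32 matches the total downsampling factor of the YOLOv3 backbone, so the output grids always have integer dimensions.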

## Installation
#### Requirements

- Python 3.6+
- Numpy (verified with 1.15.2)
- OpenCV
- Matplotlib
- Pytorch (verified with v0.4.0)
- Cython (verified with v0.29.1)
- [pycocotools](https://pypi.org/project/pycocotools/) (verified with v2.0.0)

optional:
- tensorboard (>1.7.0)
- [tensorboardX](https://github.com/lanpa/tensorboardX)

#### Docker Environment

We provide a Dockerfile to build an environment that meets the above requirements.

```bash
# build docker image
$ nvidia-docker build -t yolov3-in-pytorch-image --build-arg UID=`id -u` -f docker/Dockerfile .
# create a docker container and start a bash session
$ nvidia-docker run -it -v `pwd`:/work --name yolov3-in-pytorch-container yolov3-in-pytorch-image
docker@4d69df209f4a:/work$ python train.py --help
```

#### Download pretrained weights
Download the pretrained weights from the author's project page:

```bash
$ mkdir weights
$ cd weights/
$ bash ../requirements/download_weights.sh
```

#### COCO 2017 dataset:
The COCO dataset can be downloaded and unzipped with:

```bash
$ bash requirements/getcoco.sh
```

## Inference with Pretrained Weights

To detect objects in the sample image, just run:
```bash
$ python demo.py --image data/mountain.png --detect_thresh 0.5 --weights_path weights/yolov3.weights
```
## Train

```bash
$ python train.py --help
usage: train.py [-h] [--cfg CFG] [--weights_path WEIGHTS_PATH] [--n_cpu N_CPU]
                [--checkpoint_interval CHECKPOINT_INTERVAL]
                [--eval_interval EVAL_INTERVAL] [--checkpoint CHECKPOINT]
                [--checkpoint_dir CHECKPOINT_DIR] [--use_cuda USE_CUDA]
                [--debug] [--tfboard TFBOARD]

optional arguments:
  -h, --help            show this help message and exit
  --cfg CFG             config file. see readme
  --weights_path WEIGHTS_PATH
                        darknet weights file
  --n_cpu N_CPU         number of workers
  --checkpoint_interval CHECKPOINT_INTERVAL
                        interval between saving checkpoints
  --eval_interval EVAL_INTERVAL
                        interval between evaluations
  --checkpoint CHECKPOINT
                        pytorch checkpoint file path
  --checkpoint_dir CHECKPOINT_DIR
                        directory where checkpoint files are saved
  --use_cuda USE_CUDA
  --debug               debug mode where only one image is trained
  --tfboard TFBOARD     tensorboard path for logging
```
Example:
```bash
$ python train.py --weights_path weights/darknet53.conv.74 --tfboard log
```
The training configuration is written in YAML files located in the `config` folder.
We use the following format:
```yaml
MODEL:
  TYPE: YOLOv3
  BACKBONE: darknet53
TRAIN:
  LR: 0.001
  MOMENTUM: 0.9
  DECAY: 0.0005
  BURN_IN: 1000 # duration (iters) for learning rate burn-in
  MAXITER: 500000
  STEPS: (400000, 450000) # lr-drop iter points
  BATCHSIZE: 4
  SUBDIVISION: 16 # num of minibatch inner-iterations
  IMGSIZE: 608 # initial image size
  CONFWEIGHT: 1 # not used
  LOSSTYPE: l2 # loss type for w, h
  IGNORETHRE: 0.7 # IoU threshold for learning conf
  RANDRESIZE: True # enable random resizing
TEST:
  CONFTHRE: 0.8 # not used
  NMSTHRE: 0.45 # same as official darknet
  IMGSIZE: 416 # this can be changed to measure acc-speed tradeoff
NUM_GPUS: 1
```
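
To make the `BURN_IN`, `STEPS`, `BATCHSIZE` and `SUBDIVISION` entries concrete, here is a minimal sketch of how such values could drive a learning-rate schedule. The polynomial burn-in with exponent 4 and the 10x drop at each step follow darknet's defaults and are assumptions here, as is the reading of `SUBDIVISION` as gradient accumulation. `lr_factor` is a hypothetical helper, not code from `train.py`.

```python
def lr_factor(it, burn_in=1000, steps=(400000, 450000), power=4, gamma=0.1):
    """Multiplier applied to the base learning rate at iteration `it`.

    Assumed behaviour: polynomial warm-up during burn-in, then the rate
    is multiplied by `gamma` each time a step point is passed.
    """
    if it < burn_in:
        return (it / burn_in) ** power
    factor = 1.0
    for step in steps:
        if it >= step:
            factor *= gamma
    return factor


base_lr = 0.001                  # TRAIN.LR
batchsize, subdivision = 4, 16   # TRAIN.BATCHSIZE, TRAIN.SUBDIVISION

# If gradients are accumulated over SUBDIVISION minibatches before each
# optimizer step, this many images contribute to one weight update.
effective_batch = batchsize * subdivision

for it in (0, 500, 1000, 400000, 450000):
    print(it, base_lr * lr_factor(it))
```

With the values above, one update would see 64 images, which matches the batch size commonly used for darknet YOLOv3 training.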
## Evaluate COCO AP
```bash
$ python train.py --cfg config/yolov3_eval.cfg --eval_interval 1 [--checkpoint ckpt_path] [--weights_path weights_path]
```
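
For readers unfamiliar with the metric names, AP[IoU=0.50:0.95] is the COCO average precision averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, while AP[IoU=0.50] uses the single threshold 0.50. The snippet below is an illustrative sketch of the IoU computation and the threshold list. It is not the evaluator used by this repository, which relies on pycocotools, and `box_iou` is a hypothetical helper.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind AP[IoU=0.50:0.95].
thresholds = [0.50 + 0.05 * i for i in range(10)]

gt = (0, 0, 10, 10)     # ground-truth box
det = (0, 0, 9, 9)      # detected box
iou = box_iou(gt, det)

# Number of thresholds at which this detection would count as a match.
n_matched = sum(iou >= t for t in thresholds)
print(iou, n_matched)
```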

## TODOs
- [x] Precision Evaluator (bbox, COCO metric)
- [x] Modify the target builder
- [x] Modify loss calculation
- [x] Training Scheduler
- [x] Weight initialization
- [x] Augmentation : Resizing
- [ ] Augmentation : Random Distortion
- [ ] Augmentation : Jitter
- [ ] Augmentation : Flip


## Paper
### YOLOv3: An Incremental Improvement
_Joseph Redmon, Ali Farhadi_ <br>

[[Paper]](https://pjreddie.com/media/files/papers/YOLOv3.pdf) [[Original Implementation]](https://github.com/pjreddie/darknet)
[[Author's Project Page]](https://pjreddie.com/darknet/yolo/)

## Credit
```
@article{yolov3,
title={YOLOv3: An Incremental Improvement},
author={Redmon, Joseph and Farhadi, Ali},
journal = {arXiv},
year={2018}
}
```
22 changes: 22 additions & 0 deletions config/yolov3_default.cfg
@@ -0,0 +1,22 @@
MODEL:
  TYPE: YOLOv3
  BACKBONE: darknet53
TRAIN:
  LR: 0.001
  MOMENTUM: 0.9
  DECAY: 0.0005
  BURN_IN: 1000
  MAXITER: 500000
  STEPS: (400000, 450000)
  BATCHSIZE: 4
  SUBDIVISION: 16
  IMGSIZE: 608
  CONFWEIGHT: 1
  LOSSTYPE: l2
  IGNORETHRE: 0.7
  RANDRESIZE: True
TEST:
  CONFTHRE: 0.8
  NMSTHRE: 0.45
  IMGSIZE: 416
NUM_GPUS: 1
22 changes: 22 additions & 0 deletions config/yolov3_eval.cfg
@@ -0,0 +1,22 @@
MODEL:
  TYPE: YOLOv3
  BACKBONE: darknet53
TRAIN:
  LR: 0.00
  MOMENTUM: 0.9
  DECAY: 0.0005
  BURN_IN: 0
  MAXITER: 2
  STEPS: (99, 999)
  BATCHSIZE: 1
  SUBDIVISION: 1
  CONFWEIGHT: 1
  LOSSTYPE: l2
  IGNORETHRE: 0.7
  IMGSIZE: 608
  RANDRESIZE: False
TEST:
  CONFTHRE: 0.8
  NMSTHRE: 0.45
  IMGSIZE: 416
NUM_GPUS: 1
Binary file added data/innsbruck.png
Binary file added data/innsbruck_result.png
Binary file added data/mountain.png
Binary file added data/mountain_result.png
Binary file added data/val_comparison.png
101 changes: 101 additions & 0 deletions dataset/cocodataset.py
@@ -0,0 +1,101 @@
import os
import numpy as np

import torch
from torch.utils.data import Dataset
import cv2
from pycocotools.coco import COCO

from utils.utils import *


class COCODataset(Dataset):
    """
    COCO dataset class.
    """
    def __init__(self, model_type, data_dir='COCO', json_file='instances_train2017.json',
                 name='train2017', img_size=416, min_size=1, debug=False):
        """
        COCO dataset initialization. Annotation data are read into memory by COCO API.
        Args:
            model_type (str): model name specified in config file
            data_dir (str): dataset root directory
            json_file (str): COCO json file name
            name (str): COCO data name (e.g. 'train2017' or 'val2017')
            img_size (int): target image size after pre-processing
            min_size (int): bounding boxes smaller than this are ignored
            debug (bool): if True, only one data id is selected from the dataset
        """
        self.data_dir = data_dir
        self.json_file = json_file
        self.model_type = model_type
        # os.path.join works whether or not data_dir ends with a separator
        self.coco = COCO(os.path.join(self.data_dir, 'annotations', self.json_file))
        self.ids = self.coco.getImgIds()
        if debug:
            self.ids = self.ids[1:2]
            print("debug mode...", self.ids)
        self.class_ids = sorted(self.coco.getCatIds())
        self.name = name
        self.max_labels = 50
        self.img_size = img_size
        self.min_size = min_size

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, index):
        """
        One image / label pair for the given index is picked up \
        and pre-processed.
        Args:
            index (int): data index
        Returns:
            img (numpy.ndarray): pre-processed image
            padded_labels (torch.Tensor): pre-processed label data. \
                The shape is :math:`[self.max_labels, 5]`. \
                each label consists of [class, xc, yc, w, h]:
                    class (float): class index.
                    xc, yc (float) : center of bbox whose values range from 0 to 1.
                    w, h (float) : size of bbox whose values range from 0 to 1.
            info_img : tuple of h, w, nh, nw, dx, dy.
                h, w (int): original shape of the image
                nh, nw (int): shape of the resized image without padding
                dx, dy (int): pad size
            id_ (int): same as the input index. Used for evaluation.
        """
        id_ = self.ids[index]

        anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=None)
        annotations = self.coco.loadAnns(anno_ids)

        # load image and preprocess
        img_file = os.path.join(self.data_dir, self.name,
                                '{:012}'.format(id_) + '.jpg')
        img = cv2.imread(img_file)

        # val5k images that are missing from `name` are looked up in train2017
        if self.json_file == 'instances_val5k.json' and img is None:
            img_file = os.path.join(self.data_dir, 'train2017',
                                    '{:012}'.format(id_) + '.jpg')
            img = cv2.imread(img_file)
        assert img is not None

        img, info_img = preprocess(img, self.img_size)

        # load labels, skipping boxes whose width or height is too small
        labels = []
        for anno in annotations:
            if anno['bbox'][2] > self.min_size and anno['bbox'][3] > self.min_size:
                labels.append([])
                labels[-1].append(self.class_ids.index(anno['category_id']))
                labels[-1].extend(anno['bbox'])

        # pad (or truncate) the labels to a fixed [max_labels, 5] array
        padded_labels = np.zeros((self.max_labels, 5))
        if len(labels) > 0:
            labels = np.stack(labels)
            if 'YOLO' in self.model_type:
                labels = label2yolobox(labels, info_img, self.img_size)
            padded_labels[range(len(labels))[:self.max_labels]
                          ] = labels[:self.max_labels]
        padded_labels = torch.from_numpy(padded_labels)

        return img, padded_labels, info_img, id_
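
The helpers `preprocess` and `label2yolobox` come from `utils.utils` and are not shown in this part of the diff. As a rough illustration of what the docstring above describes, the sketch below converts a COCO `[x, y, w, h]` box in original pixel coordinates into a normalized `[xc, yc, w, h]` box on a letterboxed square image, then pads the label list to a fixed length. The functions `coco_to_yolo` and `pad_labels` are hypothetical names, and the letterbox arithmetic is an assumption based only on the documented `info_img` tuple.

```python
def coco_to_yolo(bbox, info_img, img_size):
    """Map a COCO [x, y, w, h] box to normalized [xc, yc, w, h].

    info_img is (h, w, nh, nw, dx, dy) as documented in __getitem__.
    The scaling and padding logic here is an assumption, not the
    repository's label2yolobox implementation.
    """
    h, w, nh, nw, dx, dy = info_img
    x, y, bw, bh = bbox
    sx, sy = nw / w, nh / h                      # resize factors
    xc = ((x + bw / 2) * sx + dx) / img_size     # box center, padded frame
    yc = ((y + bh / 2) * sy + dy) / img_size
    return [xc, yc, bw * sx / img_size, bh * sy / img_size]


def pad_labels(labels, max_labels=50):
    """Truncate or zero-pad a list of 5-element labels to max_labels rows."""
    rows = [list(r) for r in labels[:max_labels]]
    rows += [[0.0] * 5 for _ in range(max_labels - len(rows))]
    return rows


# A 640x480 image letterboxed into 416x416: resized to 416x312, padded by 52 rows.
info_img = (480, 640, 312, 416, 0, 52)
box = coco_to_yolo([160, 120, 320, 240], info_img, 416)
padded = pad_labels([[0.0] + box])
print(box, len(padded))
```

Padding to a fixed number of rows lets labels from different images be stacked into one batch tensor, which is presumably why the dataset class uses `max_labels`.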