This repository has been archived by the owner on Jun 15, 2022. It is now read-only.
Commit
0 parents
commit 386c5d2
Showing 22 changed files with 1,705 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
Copyright (c) 2018 DeNA Co., Ltd. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, and/or sublicense | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software; and | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, | ||
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF | ||
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. | ||
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY | ||
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, | ||
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE | ||
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
|
||
|
||
This software uses some portions from the following software under its license: | ||
|
||
chainercv | ||
|
||
The MIT License | ||
|
||
Copyright (c) 2017 Preferred Networks, Inc. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,166 @@ | ||
# YOLOv3 in PyTorch | ||
PyTorch implementation of YOLOv3 | ||
|
||
<p align="left"><img src="data/innsbruck_result.png" height="160"\> <img src="data/mountain_result.png" height="160"\></p> | ||
|
||
## What's New | ||
- 18/11/27 [COCO AP results of darknet (training) are reproduced with the same training conditions](#performance) | ||
- 18/11/20 verified inference COCO AP[IoU=0.50:0.95] = 0.302 (paper: 0.310), val5k, 416x416 | ||
- 18/11/20 verified inference COCO AP[IoU=0.50] = 0.544 (paper: 0.553), val5k, 416x416 | ||
|
||
## Performance | ||
|
||
<table><tbody> | ||
<tr><th align="left" bgcolor=#f8f8f8> </th> <td bgcolor=white> Original (darknet) </td><td bgcolor=white> Ours (pytorch) </td></tr> | ||
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50:0.95], inference</th> <td bgcolor=white> 0.310 </td><td bgcolor=white> 0.302 </td></tr> | ||
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50], inference</th> <td bgcolor=white> 0.553 </td><td bgcolor=white> 0.544 </td></tr> | ||
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50:0.95], training</th> <td bgcolor=white> 0.310 </td><td bgcolor=white> to be updated</td></tr> | ||
<tr><th align="left" bgcolor=#f8f8f8> COCO AP[IoU=0.50], training</th> <td bgcolor=white> 0.553 </td><td bgcolor=white> to be updated </td></tr> | ||
</tbody></table> | ||
|
||
We have verified that the COCO val results of darknet are reproduced when random resizing is the only augmentation used: | ||
<p align="left"><img src="data/val_comparison.png" height="280"\></p> | ||
|
||
## Installation | ||
#### Requirements | ||
|
||
- Python 3.6+ | ||
- Numpy (verified as operable: 1.15.2) | ||
- OpenCV | ||
- Matplotlib | ||
- PyTorch (verified as operable: v0.4.0) | ||
- Cython (verified as operable: v0.29.1) | ||
- [pycocotools](https://pypi.org/project/pycocotools/) (verified as operable: v2.0.0) | ||
|
||
optional: | ||
- tensorboard (>1.7.0) | ||
- [tensorboardX](https://github.com/lanpa/tensorboardX) | ||
|
||
#### Docker Environment | ||
|
||
We provide a Dockerfile to build an environment that meets the above requirements. | ||
|
||
```bash | ||
# build docker image | ||
$ nvidia-docker build -t yolov3-in-pytorch-image --build-arg UID=`id -u` -f docker/Dockerfile . | ||
# create docker container and login bash | ||
$ nvidia-docker run -it -v `pwd`:/work --name yolov3-in-pytorch-container yolov3-in-pytorch-image | ||
docker@4d69df209f4a:/work$ python train.py --help | ||
``` | ||
|
||
#### Download pretrained weights | ||
Download the pretrained weights file from the author's project page: | ||
|
||
```bash | ||
$ mkdir weights | ||
$ cd weights/ | ||
$ bash ../requirements/download_weights.sh | ||
``` | ||
|
||
#### COCO 2017 dataset: | ||
Download and unzip the COCO dataset with: | ||
|
||
```bash | ||
$ bash requirements/getcoco.sh | ||
``` | ||
|
||
## Inference with Pretrained Weights | ||
|
||
To detect objects in the sample image, just run: | ||
```bash | ||
$ python demo.py --image data/mountain.png --detect_thresh 0.5 --weights_path weights/yolov3.weights | ||
``` | ||
## Train | ||
|
||
```bash | ||
$ python train.py --help | ||
usage: train.py [-h] [--cfg CFG] [--weights_path WEIGHTS_PATH] [--n_cpu N_CPU] | ||
                [--checkpoint_interval CHECKPOINT_INTERVAL] | ||
                [--eval_interval EVAL_INTERVAL] [--checkpoint CHECKPOINT] | ||
                [--checkpoint_dir CHECKPOINT_DIR] [--use_cuda USE_CUDA] | ||
                [--debug] [--tfboard TFBOARD] | ||
|
||
optional arguments: | ||
  -h, --help            show this help message and exit | ||
  --cfg CFG             config file. see readme | ||
  --weights_path WEIGHTS_PATH | ||
                        darknet weights file | ||
  --n_cpu N_CPU         number of workers | ||
  --checkpoint_interval CHECKPOINT_INTERVAL | ||
                        interval between saving checkpoints | ||
  --eval_interval EVAL_INTERVAL | ||
                        interval between evaluations | ||
  --checkpoint CHECKPOINT | ||
                        pytorch checkpoint file path | ||
  --checkpoint_dir CHECKPOINT_DIR | ||
                        directory where checkpoint files are saved | ||
  --use_cuda USE_CUDA | ||
  --debug               debug mode where only one image is trained | ||
  --tfboard TFBOARD     tensorboard path for logging | ||
``` | ||
example: | ||
```bash | ||
$ python train.py --weights_path weights/darknet53.conv.74 --tfboard log | ||
``` | ||
The training configuration is written in YAML files located in the `config` folder. | ||
The files use the following format: | ||
```yaml | ||
MODEL: | ||
  TYPE: YOLOv3 | ||
  BACKBONE: darknet53 | ||
TRAIN: | ||
  LR: 0.001 | ||
  MOMENTUM: 0.9 | ||
  DECAY: 0.0005 | ||
  BURN_IN: 1000 # duration (iters) for learning rate burn-in | ||
  MAXITER: 500000 | ||
  STEPS: (400000, 450000) # lr-drop iter points | ||
  BATCHSIZE: 4 | ||
  SUBDIVISION: 16 # num of minibatch inner-iterations | ||
  IMGSIZE: 608 # initial image size | ||
  CONFWEIGHT: 1 # not used | ||
  LOSSTYPE: l2 # loss type for w, h | ||
  IGNORETHRE: 0.7 # IoU threshold for learning conf | ||
  RANDRESIZE: True # enable random resizing | ||
TEST: | ||
  CONFTHRE: 0.8 # not used | ||
  NMSTHRE: 0.45 # same as official darknet | ||
  IMGSIZE: 416 # this can be changed to measure acc-speed tradeoff | ||
NUM_GPUS: 1 | ||
|
||
``` | ||
## Evaluate COCO AP | ||
```bash | ||
$ python train.py --cfg config/yolov3_eval.cfg --eval_interval 1 [--checkpoint ckpt_path] [--weights_path weights_path] | ||
``` | ||
|
||
## TODOs | ||
- [x] Precision Evaluator (bbox, COCO metric) | ||
- [x] Modify the target builder | ||
- [x] Modify loss calculation | ||
- [x] Training Scheduler | ||
- [x] Weight initialization | ||
- [x] Augmentation : Resizing | ||
- [ ] Augmentation : Random Distortion | ||
- [ ] Augmentation : Jitter | ||
- [ ] Augmentation : Flip | ||
|
||
|
||
## Paper | ||
### YOLOv3: An Incremental Improvement | ||
_Joseph Redmon, Ali Farhadi_ <br> | ||
|
||
[[Paper]](https://pjreddie.com/media/files/papers/YOLOv3.pdf) [[Original Implementation]](https://github.com/pjreddie/darknet) | ||
[[Author's Project Page]](https://pjreddie.com/darknet/yolo/) | ||
|
||
## Credit | ||
``` | ||
@article{yolov3, | ||
title={YOLOv3: An Incremental Improvement}, | ||
author={Redmon, Joseph and Farhadi, Ali}, | ||
journal = {arXiv}, | ||
year={2018} | ||
} | ||
``` |
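The README's training config describes `SUBDIVISION` as the number of minibatch inner-iterations. One plausible reading, assumed here and not confirmed by this diff, is gradient accumulation: gradients from `SUBDIVISION` minibatches of `BATCHSIZE` images are averaged before a single optimizer step, giving an effective batch of 4 x 16 = 64 images. The pure-Python sketch below uses a toy one-parameter model; `grad_mse` and `accumulated_grad` are hypothetical names, not functions from this repository.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, batch_size, subdivision):
    """Average the per-minibatch gradients over `subdivision` inner iterations."""
    total = 0.0
    for k in range(subdivision):
        start = k * batch_size
        stop = start + batch_size
        total += grad_mse(w, xs[start:stop], ys[start:stop])
    return total / subdivision


# Toy data: 64 samples, matching BATCHSIZE (4) x SUBDIVISION (16).
xs = [0.1 * i for i in range(64)]
ys = [3.0 * x + 1.0 for x in xs]

full = grad_mse(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, batch_size=4, subdivision=16)
print("gradients agree:", abs(full - accum) < 1e-9)
```

Because every minibatch has the same size, the mean of the minibatch gradients equals the gradient over all 64 samples, which is why a small `BATCHSIZE` can stand in for a larger batch when GPU memory is limited.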
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
MODEL: | ||
  TYPE: YOLOv3 | ||
  BACKBONE: darknet53 | ||
TRAIN: | ||
  LR: 0.001 | ||
  MOMENTUM: 0.9 | ||
  DECAY: 0.0005 | ||
  BURN_IN: 1000 | ||
  MAXITER: 500000 | ||
  STEPS: (400000, 450000) | ||
  BATCHSIZE: 4 | ||
  SUBDIVISION: 16 | ||
  IMGSIZE: 608 | ||
  CONFWEIGHT: 1 | ||
  LOSSTYPE: l2 | ||
  IGNORETHRE: 0.7 | ||
  RANDRESIZE: True | ||
TEST: | ||
  CONFTHRE: 0.8 | ||
  NMSTHRE: 0.45 | ||
  IMGSIZE: 416 | ||
NUM_GPUS: 1 |
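The `LR`, `BURN_IN`, and `STEPS` values in this config suggest a warm-up followed by stepwise learning-rate drops. The sketch below is an assumption about how such a schedule could be computed, not the code from this commit: it uses a darknet-style polynomial warm-up (power 4) and a 10x drop at each step, and the helper names `parse_steps` and `lr_at` are hypothetical. It also assumes the `STEPS` value reaches Python as the plain string `"(400000, 450000)"`, since YAML has no tuple syntax.

```python
import ast


def parse_steps(raw):
    """Turn a STEPS value such as "(400000, 450000)" into a tuple of ints."""
    if isinstance(raw, str):
        raw = ast.literal_eval(raw)  # safe parsing of a Python literal
    return tuple(int(s) for s in raw)


def lr_at(iteration, base_lr=0.001, burn_in=1000, steps=(400000, 450000)):
    """Learning rate at a given iteration under the assumed schedule."""
    if iteration < burn_in:
        factor = (iteration / burn_in) ** 4  # assumed polynomial warm-up
    elif iteration < steps[0]:
        factor = 1.0
    elif iteration < steps[1]:
        factor = 0.1  # assumed 10x drop at the first step
    else:
        factor = 0.01  # assumed further 10x drop at the second step
    return base_lr * factor


steps = parse_steps("(400000, 450000)")
for it in (0, 500, 1000, 400000, 450000):
    print(it, lr_at(it, steps=steps))
```

With `BURN_IN: 0`, as in a debug-style config, the warm-up branch is never taken, so the division by `burn_in` is never evaluated.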
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
MODEL: | ||
  TYPE: YOLOv3 | ||
  BACKBONE: darknet53 | ||
TRAIN: | ||
  LR: 0.00 | ||
  MOMENTUM: 0.9 | ||
  DECAY: 0.0005 | ||
  BURN_IN: 0 | ||
  MAXITER: 2 | ||
  STEPS: (99, 999) | ||
  BATCHSIZE: 1 | ||
  SUBDIVISION: 1 | ||
  CONFWEIGHT: 1 | ||
  LOSSTYPE: l2 | ||
  IGNORETHRE: 0.7 | ||
  IMGSIZE: 608 | ||
  RANDRESIZE: False | ||
TEST: | ||
  CONFTHRE: 0.8 | ||
  NMSTHRE: 0.45 | ||
  IMGSIZE: 416 | ||
NUM_GPUS: 1 |
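The `NMSTHRE: 0.45` entry under `TEST` is an IoU threshold for non-maximum suppression. As a rough illustration of what that threshold controls, here is a minimal class-agnostic greedy NMS in pure Python. It is a sketch under stated assumptions: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the "suppress when IoU is strictly greater than the threshold" convention are all hypothetical, and the repository's own post-processing may differ (for example, by running per class).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, nms_thre=0.45):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thre for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

In this toy input the first two boxes overlap with an IoU of 81/119 (about 0.68), so the lower-scoring one is suppressed at a threshold of 0.45; raising the threshold above that value would keep both.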
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
import os | ||
import numpy as np | ||
|
||
import torch | ||
from torch.utils.data import Dataset | ||
import cv2 | ||
from pycocotools.coco import COCO | ||
|
||
from utils.utils import * | ||
|
||
|
||
class COCODataset(Dataset): | ||
    """ | ||
    COCO dataset class. | ||
    """ | ||
    def __init__(self, model_type, data_dir='COCO', json_file='instances_train2017.json', | ||
                 name='train2017', img_size=416, min_size=1, debug=False): | ||
        """ | ||
        COCO dataset initialization. Annotation data are read into memory by COCO API. | ||
        Args: | ||
            model_type (str): model name specified in config file | ||
            data_dir (str): dataset root directory | ||
            json_file (str): COCO json file name | ||
            name (str): COCO data name (e.g. 'train2017' or 'val2017') | ||
            img_size (int): target image size after pre-processing | ||
            min_size (int): bounding boxes smaller than this are ignored | ||
            debug (bool): if True, only one data id is selected from the dataset | ||
        """ | ||
        self.data_dir = data_dir | ||
        self.json_file = json_file | ||
        self.model_type = model_type | ||
        # os.path.join works whether or not data_dir ends with a path separator | ||
        self.coco = COCO(os.path.join(self.data_dir, 'annotations', self.json_file)) | ||
        self.ids = self.coco.getImgIds() | ||
        if debug: | ||
            self.ids = self.ids[1:2] | ||
            print("debug mode...", self.ids) | ||
        self.class_ids = sorted(self.coco.getCatIds()) | ||
        self.name = name | ||
        self.max_labels = 50 | ||
        self.img_size = img_size | ||
        self.min_size = min_size | ||
|
||
    def __len__(self): | ||
        return len(self.ids) | ||
|
||
    def __getitem__(self, index): | ||
        """ | ||
        One image / label pair for the given index is picked up \ | ||
        and pre-processed. | ||
        Args: | ||
            index (int): data index | ||
        Returns: | ||
            img (numpy.ndarray): pre-processed image | ||
            padded_labels (torch.Tensor): pre-processed label data. \ | ||
                The shape is :math:`[self.max_labels, 5]`. \ | ||
                each label consists of [class, xc, yc, w, h]: | ||
                    class (float): class index. | ||
                    xc, yc (float) : center of bbox whose values range from 0 to 1. | ||
                    w, h (float) : size of bbox whose values range from 0 to 1. | ||
            info_img : tuple of h, w, nh, nw, dx, dy. | ||
                h, w (int): original shape of the image | ||
                nh, nw (int): shape of the resized image without padding | ||
                dx, dy (int): pad size | ||
            id_ (int): COCO image id for the given index. Used for evaluation. | ||
        """ | ||
        id_ = self.ids[index] | ||
|
||
        anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=None) | ||
        annotations = self.coco.loadAnns(anno_ids) | ||
|
||
        # load image and preprocess | ||
        img_file = os.path.join(self.data_dir, self.name, | ||
                                '{:012}'.format(id_) + '.jpg') | ||
        img = cv2.imread(img_file) | ||
|
||
        if self.json_file == 'instances_val5k.json' and img is None: | ||
            img_file = os.path.join(self.data_dir, 'train2017', | ||
                                    '{:012}'.format(id_) + '.jpg') | ||
            img = cv2.imread(img_file) | ||
        assert img is not None | ||
|
||
        img, info_img = preprocess(img, self.img_size) | ||
|
||
        # load labels | ||
        labels = [] | ||
        for anno in annotations: | ||
            if anno['bbox'][2] > self.min_size and anno['bbox'][3] > self.min_size: | ||
                labels.append([]) | ||
                labels[-1].append(self.class_ids.index(anno['category_id'])) | ||
                labels[-1].extend(anno['bbox']) | ||
|
||
        padded_labels = np.zeros((self.max_labels, 5)) | ||
        if len(labels) > 0: | ||
            labels = np.stack(labels) | ||
            if 'YOLO' in self.model_type: | ||
                labels = label2yolobox(labels, info_img, self.img_size) | ||
            padded_labels[range(len(labels))[:self.max_labels] | ||
                          ] = labels[:self.max_labels] | ||
        padded_labels = torch.from_numpy(padded_labels) | ||
|
||
        return img, padded_labels, info_img, id_ |
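The docstring of `__getitem__` describes `info_img` as `(h, w, nh, nw, dx, dy)` and the labels as normalized `[class, xc, yc, w, h]` rows padded to `max_labels`. The helpers `preprocess` and `label2yolobox` are imported from `utils.utils` and are not shown in this part of the diff, so the following pure-Python sketch is only a guess at the geometry involved: an aspect-preserving resize with centered padding, followed by converting a COCO `[x, y, w, h]` pixel box into normalized center format. The names `letterbox_info`, `coco_box_to_yolo`, and `pad_labels` are hypothetical.

```python
def letterbox_info(h, w, img_size):
    """Return (h, w, nh, nw, dx, dy) for an aspect-preserving resize with centered padding."""
    longest = max(h, w)
    nh = h * img_size // longest  # resized height, integer arithmetic
    nw = w * img_size // longest  # resized width
    dx = (img_size - nw) // 2     # horizontal pad
    dy = (img_size - nh) // 2     # vertical pad
    return h, w, nh, nw, dx, dy


def coco_box_to_yolo(bbox, info_img, img_size):
    """COCO [x, y, w, h] in original pixels -> [xc, yc, w, h] normalized to the padded square."""
    h, w, nh, nw, dx, dy = info_img
    x, y, bw, bh = bbox
    xc = ((x + bw / 2) / w * nw + dx) / img_size
    yc = ((y + bh / 2) / h * nh + dy) / img_size
    return [xc, yc, bw / w * nw / img_size, bh / h * nh / img_size]


def pad_labels(labels, max_labels=50):
    """Pad (or truncate) a list of 5-element labels to exactly max_labels rows."""
    padded = [[0.0] * 5 for _ in range(max_labels)]
    for row, label in zip(padded, labels[:max_labels]):
        row[:] = label
    return padded


info = letterbox_info(480, 640, 416)
box = coco_box_to_yolo([0, 0, 640, 480], info, 416)
labels = pad_labels([[0.0] + box])
print(info, box, len(labels))
```

Padding every sample to a fixed number of rows is what lets label arrays from different images be stacked into one batch tensor; all-zero rows can then be treated as "no object".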