[Feature] Add script converts yolo-style to coco format #161

Merged
merged 58 commits into from
Oct 21, 2022
f38127b
the script converts yolo-style to coco format draft 1
xin-li-67 Oct 13, 2022
44601c0
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
4542ee7
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
65386f9
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
54bceb6
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
0d4d9ab
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
60c8a91
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
e4d2eaa
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
101cb39
typo fixed and update
xin-li-67 Oct 14, 2022
58fa514
encaped and modified yolo2coco
xin-li-67 Oct 14, 2022
8f97cb6
more usage instructions of yolo2coco
xin-li-67 Oct 14, 2022
9d98c42
Update docs/zh_cn/user_guides/useful_tools.md
xin-li-67 Oct 14, 2022
257a985
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 14, 2022
0265345
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
682e665
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 14, 2022
2ca277f
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 14, 2022
2dc0528
Update docs/zh_cn/user_guides/useful_tools.md
xin-li-67 Oct 14, 2022
659aedc
Update docs/zh_cn/user_guides/useful_tools.md
xin-li-67 Oct 14, 2022
75937ba
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 14, 2022
a4d3fe2
final draft of yolo2coco
xin-li-67 Oct 14, 2022
6961b35
Merge branch 'dev' of github.com:open-mmlab/mmyolo into yolo2coco
xin-li-67 Oct 14, 2022
546c115
final draft of yolo2coco
xin-li-67 Oct 14, 2022
818303f
final draft of yolo2coco
xin-li-67 Oct 14, 2022
60cbae5
Merge branch 'yolo2coco' of github.com:xin-li-67/mmyolo into yolo2coco
xin-li-67 Oct 14, 2022
1a0d21a
revise the script and remove -split indicator
xin-li-67 Oct 14, 2022
efe3bd5
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 15, 2022
cb2549b
update the structural example of the root path
xin-li-67 Oct 15, 2022
aac8537
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 16, 2022
21e00d6
fix typo on check_existence
xin-li-67 Oct 16, 2022
0cb43fb
Update docs/zh_cn/user_guides/useful_tools.md
xin-li-67 Oct 16, 2022
f0d778f
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 16, 2022
b869eb8
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 16, 2022
018a8f6
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 16, 2022
3a3204c
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 16, 2022
bcdac33
update the commits
xin-li-67 Oct 16, 2022
18ead51
rewrite split_by_files part and add copy_image part
xin-li-67 Oct 19, 2022
d4c854e
add instruction on using absolute image paths
xin-li-67 Oct 19, 2022
e439ad3
add instruction on using absolute image paths
xin-li-67 Oct 19, 2022
3dea55f
Update docs/en/user_guides/useful_tools.md
RangeKing Oct 20, 2022
1be362e
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 20, 2022
8913f61
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 20, 2022
9676c64
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 20, 2022
5e4cd57
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 20, 2022
6f015fc
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 20, 2022
a1fc8ed
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 20, 2022
423198e
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 20, 2022
7092c78
add basic statistic func
xin-li-67 Oct 20, 2022
525b390
add basic statistic func
xin-li-67 Oct 20, 2022
1d3d797
revise the script
xin-li-67 Oct 20, 2022
b1c1c74
remove copy&move image func
xin-li-67 Oct 21, 2022
a819d07
Update docs/zh_cn/user_guides/useful_tools.md
xin-li-67 Oct 21, 2022
d5c84e9
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 21, 2022
43d00b7
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 21, 2022
c873127
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 21, 2022
f8a23f7
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 21, 2022
e157b3b
Update docs/en/user_guides/useful_tools.md
xin-li-67 Oct 21, 2022
df1c708
Update tools/dataset_converters/yolo2coco.py
xin-li-67 Oct 21, 2022
f6995ef
revise image_dir
xin-li-67 Oct 21, 2022
revise the script and remove -split indicator
xin-li-67 committed Oct 14, 2022
commit 1a0d21a2e76ae0104baf7d71528865be3fa2b9ec
8 changes: 6 additions & 2 deletions docs/en/user_guides/useful_tools.md
@@ -135,12 +135,16 @@ Instructions:
 └── $ROOT_PATH
     ├── class.txt
     ├── labels
+        ├── a.txt
+        ├── b.txt
     ├── images
+        ├── a.jpg
+        ├── b.png
     └── ...
 ```

-2. `split` indicates whether you need to format the dataset based on the existing `train`, `val`, and `text` categories. It is not used by default. If you need to use it, please ensure the corresponding `train.txt`, ` val.txt`, and `test.txt` must exist under `image-dir`. Otherwise, the script will fail to run.
-3. By default, the script will create a folder called `coco_format` in the `image-dir` directory where stores the converted JSON file. If `split` is not specified, the output file is `result.json`. If `split` is specified, three corresponding files called `train.json`, `val.json`, and `test.json` will be generated.
+2. The script automatically checks whether `train.txt`, `val.txt`, and `test.txt` already exist under `image-dir`. If all three files are found, the script organizes the dataset accordingly. Otherwise, it converts the whole dataset into a single file.
+3. By default, the script creates a folder called `coco_format` in the `image-dir` directory to store the converted JSON files. If `train.txt`, `val.txt`, and `test.txt` are not found, the output file is `result.json`. Otherwise, three corresponding files called `train.json`, `val.json`, and `test.json` are generated.

## Download Dataset

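The behaviour described in items 2 and 3 can be sketched as follows. This is a minimal illustration, not the script's own code: the helper name `expected_outputs` and the constant `SPLIT_FILES` are hypothetical, and the sketch only predicts which JSON files would be written for a given `image-dir`.

```python
import os.path as osp

# File names taken from the instructions above.
SPLIT_FILES = ('train.txt', 'val.txt', 'test.txt')


def expected_outputs(image_dir: str) -> list:
    """Return the JSON paths the converter is described as writing.

    Hypothetical helper: it mirrors the documented rule that all three
    split files must exist for per-split output to be produced.
    """
    has_all_splits = all(
        osp.exists(osp.join(image_dir, name)) for name in SPLIT_FILES)
    out_dir = osp.join(image_dir, 'coco_format')
    if has_all_splits:
        # train.txt -> train.json, val.txt -> val.json, test.txt -> test.json
        return [
            osp.join(out_dir, f'{osp.splitext(name)[0]}.json')
            for name in SPLIT_FILES
        ]
    # No complete set of split files: everything goes into one file.
    return [osp.join(out_dir, 'result.json')]
```

Note that a dataset with only `train.txt` and `val.txt` falls into the single-file branch, because the check requires all three files.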
12 changes: 8 additions & 4 deletions docs/zh_cn/user_guides/useful_tools.md
@@ -128,20 +128,24 @@ python tools/dataset_converters/yolo2coco.py --image-dir /path/to/the/root/dir/o

 Instructions:

-1. `image-dir` is the root directory of the YOLO-format dataset to be converted. It should contain `images`, `labels`, and `class.txt`. `class.txt` declares the classes of the current dataset, one per line. As shown below:
-
+1. `image-dir` is the root directory of the YOLO-format dataset to be converted. It should contain `images`, `labels`, and `class.txt`. `class.txt` declares the classes of the current dataset, one class per line.
+   The structure of `image_dir` is shown in the following example:

 ```bash
 .
 └── $ROOT_PATH
     ├── class.txt
     ├── labels
+        ├── a.txt
+        ├── b.txt
     ├── images
+        ├── a.jpg
+        ├── b.png
     └── ...
 ```

-2. `split` indicates whether the conversion should organize the data by the existing `train`, `val`, and `text` categories. It is not used by default. To use it, make sure the corresponding `train.txt`, `val.txt`, and `test.txt` also exist under `image-dir`; otherwise an error is raised.
-3. By default, the script creates a `coco_format` folder under `image-dir` and stores the conversion result there. If `split` is not specified, the output file is `result.json`. If `split` is specified, three corresponding files, `train.json`, `val.json`, and `test.json`, are generated.
+2. The script checks whether `train.txt`, `val.txt`, and `test.txt` already exist under `image-dir`. If all three files are detected, it organizes the data by split; otherwise it assumes no splitting is needed. Therefore, make sure the corresponding `train.txt`, `val.txt`, and `test.txt` are under `image-dir`.
+3. By default, the script creates a `coco_format` folder under `image-dir` and stores the conversion result there. If no split files are found under `image-dir`, the output file is `result.json`. Otherwise, three corresponding files, `train.json`, `val.json`, and `test.json`, are generated.

 ## Download Dataset

51 changes: 30 additions & 21 deletions tools/dataset_converters/yolo2coco.py
@@ -2,20 +2,23 @@

 Usage:
     $ python yolo2coco.py --image-dir /path/to/dataset # the root dir
-                          [--split] # if splits

 Note:
     1. Before running this script, please make sure the root directory
     of your dataset is formatted in the following structure:
     .
     └── $ROOT_PATH
-        ├── class.txt
-        ├── labels
-        ├── images
-        └── ...
-    2. `split` is not used by default. If you need to use it, please ensure
-    the corresponding`train.txt`, ` val.txt`, and `test.txt` must exist under
-    `image-dir`. Otherwise, the script will fail to run.
+        ├── class.txt
+        ├── labels
+            ├── a.txt
+            ├── b.txt
+        ├── images
+            ├── a.jpg
+            ├── b.png
+        └── ...
+    2. The script will automatically check whether the corresponding
+    `train.txt`, `val.txt`, and `test.txt` exist under `image-dir` or not.
+    If these files are detected, the script will organize the dataset.
     3. Once the script finishes, the result files will be saved in the
     directory named 'coco_format' in the root directory of your dataset.
 """
@@ -96,14 +99,12 @@ def organize_by_existing_files(image_dir: str):
     return image_list[0], image_list[1], image_list[2]


-def convert_yolo_to_coco(image_dir: str, split: bool = False):
+def convert_yolo_to_coco(image_dir: str):
     """Convert annotations from yolo style to coco style.

     Args:
         image_dir (str): the root directory of your datasets which contains
             labels, images, classes.txt, etc
-        split (bool): whether to organize the datasets based on existing
-            train.txt, val.txt, and test.txt
     """
     print(f'Start to load existing images and annotations from {image_dir}')
     check_existance(image_dir)
@@ -114,14 +115,27 @@ def convert_yolo_to_coco(image_dir: str, split: bool = False):
     check_existance(yolo_label_dir)
     check_existance(yolo_image_dir)
     check_existance(yolo_class_txt)
+    print(f'All necessary files are located at {image_dir}')
+
+    train_txt_path = osp.join(image_dir, 'train.txt')
+    val_txt_path = osp.join(image_dir, 'val.txt')
+    test_txt_path = osp.join(image_dir, 'test.txt')
+    print(f'Checking if train.txt, val.txt, and test.txt are in {image_dir}')
+    if osp.exists(train_txt_path) and osp.exists(val_txt_path) and osp.exists(
+            test_txt_path):
+        print(
+            'these files are located, need to organize the data accordingly.')
+        to_categorize = True
+    else:
+        print('these files are not located, no need to organize separately.')
+        to_categorize = False

     with open(yolo_class_txt) as f:
         classes = f.read().strip().split()

     indices = os.listdir(yolo_image_dir)

-    if split:
-        print('Start to work based to existing train, test, and val')
+    if to_categorize:
         train_dataset = {'images': [], 'annotations': [], 'categories': []}
         val_dataset = {'images': [], 'annotations': [], 'categories': []}
         test_dataset = {'images': [], 'annotations': [], 'categories': []}
@@ -133,7 +147,6 @@ def convert_yolo_to_coco(image_dir: str, split: bool = False):
             test_dataset['categories'].append({'id': i, 'name': cls})
         train_img, val_img, test_img = organize_by_existing_files(image_dir)
     else:
-        print('Start to work on all data')
         dataset = {'images': [], 'annotations': [], 'categories': []}
         for i, cls in enumerate(classes, 0):
             dataset['categories'].append({'id': i, 'name': cls})
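A note on the category list built above: the docs in this commit say `class.txt` holds one class per line, while the script reads it with `f.read().strip().split()`, which splits on any whitespace. A multi-word class name such as `traffic light` would therefore become two categories. The sketch below shows a line-based alternative. The helper name `build_categories` is hypothetical and is not part of this PR.

```python
def build_categories(class_text: str) -> list:
    """Build COCO-style category entries from the text of a class file.

    Hypothetical helper: one class per line, blank lines ignored, ids
    assigned from 0 in file order (matching enumerate(classes, 0) above).
    """
    names = [line.strip() for line in class_text.splitlines() if line.strip()]
    return [{'id': i, 'name': name} for i, name in enumerate(names)]
```

For single-word class names both approaches give the same result, so this only matters for datasets whose class names contain spaces.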
@@ -148,7 +161,7 @@ def convert_yolo_to_coco(image_dir: str, split: bool = False):
         img_name = osp.splitext(image)[0]
         img_info_dict, H, W = get_image_info(yolo_image_dir, idx, image)

-        if split:
+        if to_categorize:
             if image in train_img:
                 dataset = train_dataset
             elif image in val_img:
@@ -177,7 +190,7 @@ def convert_yolo_to_coco(image_dir: str, split: bool = False):
     if not osp.exists(res_folder):
         os.makedirs(res_folder)

-    if split:
+    if to_categorize:
         for category in ['train', 'val', 'test']:
             out_file = osp.join(image_dir, f'coco_format/{category}.json')
             print(f'Saving converted annotations to {out_file}')
@@ -201,9 +214,5 @@ def convert_yolo_to_coco(image_dir: str, split: bool = False):
         type=str,
         required=True,
         help='dataset directory with ./images and ./labels, classes.txt, etc.')
-    parser.add_argument(
-        '--split',
-        action='store_true',
-        help='convert based on existing train.txt, val.txt, and test.txt')
     arg = parser.parse_args()
-    convert_yolo_to_coco(arg.image_dir, arg.split)
+    convert_yolo_to_coco(arg.image_dir)
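For readers following the conversion itself: the box arithmetic is not part of this commit's diff, so the sketch below is an assumption based on the usual conventions rather than the script's actual code. It assumes a YOLO label stores a normalized box as centre x, centre y, width, height, and that a COCO `bbox` is `[x_min, y_min, width, height]` in pixels. The function name `yolo_to_coco_bbox` is hypothetical. The image size plays the role of the `H` and `W` values returned by `get_image_info` above.

```python
def yolo_to_coco_bbox(cx: float, cy: float, w: float, h: float,
                      img_w: int, img_h: int) -> list:
    """Convert a normalized YOLO box to a pixel-space COCO bbox.

    Hypothetical helper. Inputs are fractions of the image size;
    the result is [x_min, y_min, width, height] in pixels.
    """
    box_w = w * img_w
    box_h = h * img_h
    # YOLO stores the box centre, COCO stores the top-left corner.
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]
```

As a usage example under those assumptions, a box centred in a 640x480 image and covering half of each dimension maps to `[160.0, 120.0, 320.0, 240.0]`.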