add det+rec ckpt prediction pipeline #216
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# MindOCR Pipeline Inference | ||
|
||
This document describes how to run text detection + text recognition pipeline inference with ckpt files trained using MindSpore. | ||
|
||
## 1. Supported Pipeline Model Combinations | ||
|
||
| Text detection + text recognition model combination | Dataset | Inference accuracy | | ||
|---------------|-------------------------------------------------------------------|---------| | ||
| DBNet+CRNN | [ICDAR15](https://rrc.cvc.uab.es/?ch=4&com=downloads)<sup>*</sup> | 55.99% | | ||
|
||
> *Inference here uses the Test Set from ICDAR15 Task 4.1. | ||
|
||
## 2. Quick Start | ||
|
||
### 2.1 Environment Setup | ||
|
||
| Environment/Device | Version | | ||
|-----------|-------| | ||
| MindSpore | >=1.9 | | ||
| Python | >=3.7 | | ||
|
||
|
||
### 2.2 Parameter Configuration | ||
|
||
Parameter configuration has two parts: (1) the model yaml config files and (2) the args of the inference script `tools/predict/text/predict_system.py`. | ||
|
||
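**Note: if an arg value is passed in (2), it overrides the corresponding value in the yaml config file from (1); otherwise, the default value in the yaml config file is used. You can also update the values in the yaml config file manually.** | ||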
|
||
#### (1) yaml config files | ||
|
||
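The detection model and the recognition model each have their own yaml config file. Pay close attention to the `predict` section in **both** files; the key parameters are shown below. | ||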
|
||
```yaml | ||
... | ||
predict: | ||
  ckpt_load_path: tmp_det/best.ckpt <--- args.det_ckpt_path overrides the detection yaml, args.rec_ckpt_path overrides the recognition yaml; or update this value manually | ||
  dataset_sink_mode: False | ||
  dataset: | ||
    type: PredictDataset | ||
    dataset_root: path/to/dataset_root <--- args.raw_data_dir overrides the detection yaml, args.crop_save_dir overrides the recognition yaml; or update this value manually | ||
    data_dir: ic15/det/test/ch4_test_images <--- args.raw_data_dir overrides the detection yaml, args.crop_save_dir overrides the recognition yaml; or update this value manually | ||
    sample_ratio: 1.0 | ||
    transform_pipeline: | ||
      ... | ||
    output_columns: [ 'img_path', 'image', 'raw_img_shape' ] | ||
    num_columns_to_net: 1 | ||
  loader: | ||
    shuffle: False | ||
    batch_size: 1 | ||
    ... | ||
``` | ||
|
||
#### (2) args list | ||
|
||
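| Parameter | Description | Default | | ||
|--------------------------------------|-----------------------------------------| -------- | | ||
| raw_data_dir | Folder containing the data to predict | - | | ||
| det_ckpt_path | Path to the detection model ckpt file | - | | ||
| rec_ckpt_path | Path to the recognition model ckpt file | - | | ||
| det_config_path | Path to the detection model yaml config file | 'configs/det/dbnet/db_r50_icdar15.yaml' | | ||
| rec_config_path | Path to the recognition model yaml config file | 'configs/rec/crnn/crnn_resnet34.yaml' | | ||
| crop_save_dir | Folder where the images cropped after detection are saved during pipeline inference, **i.e. the folder the recognition model reads images from** | 'predict_result/crop' | | ||
| result_save_path | Path where the pipeline inference result is saved | 'predict_result/ckpt_pred_result.txt' | | ||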
|
||
|
||
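The override rule described in 2.2 can be illustrated with a minimal sketch. This is not the code in `predict_system.py`; `apply_overrides` is a hypothetical helper, the configs are plain dicts standing in for the loaded yaml files, `tmp_rec/best.ckpt` is a placeholder value, and "arg not passed" is assumed to be represented by `None`. Only the ckpt paths are shown.

```python
from argparse import Namespace


def apply_overrides(det_cfg: dict, rec_cfg: dict, args: Namespace) -> None:
    """Hypothetical helper: copy args that were passed into the yaml dicts, in place."""
    # Assumption: an arg left at None was not passed, so the yaml value is kept.
    if args.det_ckpt_path is not None:
        det_cfg["predict"]["ckpt_load_path"] = args.det_ckpt_path
    if args.rec_ckpt_path is not None:
        rec_cfg["predict"]["ckpt_load_path"] = args.rec_ckpt_path


# Stand-ins for the two loaded yaml files (values are placeholders).
det_cfg = {"predict": {"ckpt_load_path": "tmp_det/best.ckpt"}}
rec_cfg = {"predict": {"ckpt_load_path": "tmp_rec/best.ckpt"}}

# Only the detection ckpt is passed on the command line in this example.
args = Namespace(det_ckpt_path="path/to/detection_ckpt", rec_ckpt_path=None)
apply_overrides(det_cfg, rec_cfg, args)
```

After the call, the detection config points at the path given on the command line, while the recognition config still holds its yaml value.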
### 2.3 Inference | ||
|
||
Run the following command to start pipeline inference. **The arg values passed below override the corresponding values in the yaml files.** | ||
|
||
```bash | ||
python tools/predict/text/predict_system.py \ | ||
--raw_data_dir path/to/raw_data \ | ||
--det_ckpt_path path/to/detection_ckpt \ | ||
--rec_ckpt_path path/to/recognition_ckpt | ||
``` | ||
|
||
### 2.4 Accuracy Evaluation | ||
|
||
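After inference finishes, the image name, text detection boxes (points), and recognized text (transcription) are saved to args.result_save_path. An example of the result file format: | ||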
```text | ||
img_1.jpg [{"transcription": "hello", "points": [600, 150, 715, 157, 714, 177, 599, 170]}, {"transcription": "world", "points": [622, 126, 695, 129, 694, 154, 621, 151]}, ...] | ||
img_2.jpg [{"transcription": "apple", "points": [553, 338, 706, 318, 709, 342, 556, 362]}, ...] | ||
... | ||
``` | ||
|
||
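Once the **ground truth file** for the pipeline inference images (in the same format as the result file above) and the **inference result file** are ready, run the following commands to evaluate the accuracy of pipeline inference. | ||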
```bash | ||
cd deploy/eval_utils | ||
python eval_pipeline.py --gt_path path/to/gt.txt --pred_path path/to/ckpt_pred_result.txt | ||
``` |
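For readers who want to inspect the result file before evaluation, the following is a minimal parsing sketch based only on the example format shown in 2.4. `parse_result_line` is a hypothetical helper, not part of the PR. It assumes the image name contains no whitespace and is separated from the JSON list by whitespace, and that the eight numbers in `points` are four (x, y) corners in order.

```python
import json


def parse_result_line(line: str):
    """Hypothetical helper: split one result line into (image name, detections)."""
    # Assumption: the image name has no whitespace and the rest of the line is JSON.
    img_name, payload = line.strip().split(maxsplit=1)
    return img_name, json.loads(payload)


line = 'img_2.jpg [{"transcription": "apple", "points": [553, 338, 706, 318, 709, 342, 556, 362]}]'
name, items = parse_result_line(line)

# Assumption: the flat list holds x1, y1, ..., x4, y4; regroup it into corner pairs.
pts = items[0]["points"]
corners = list(zip(pts[0::2], pts[1::2]))
```

The same helper would apply to the ground truth file, since the document states that both files share one format.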
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
''' | ||
Inference dataset class | ||
''' | ||
import os | ||
import random | ||
from typing import Union, List | ||
|
||
from .base_dataset import BaseDataset | ||
from .transforms.transforms_factory import create_transforms, run_transforms | ||
|
||
__all__ = ['PredictDataset'] | ||
|
||
|
||
class PredictDataset(BaseDataset): | ||
I feel like |
||
    """ | ||
    Notes: | ||
    1. The data file structure should be like | ||
        ├── img_dir | ||
        │     ├── 000001.jpg | ||
        │     ├── 000002.jpg | ||
        │     ├── {image_file_name} | ||
    """ | ||
    def __init__(self, | ||
                 # is_train: bool = False, | ||
                 dataset_root: str = '', | ||
                 data_dir: str = '', | ||
                 sample_ratio: Union[List, float] = 1.0, | ||
                 shuffle: bool = None, | ||
                 transform_pipeline: List[dict] = None, | ||
                 output_columns: List[str] = None, | ||
                 **kwargs): | ||
        img_dir = os.path.join(dataset_root, data_dir) | ||
        super().__init__(data_dir=img_dir, label_file=None, output_columns=output_columns) | ||
        self.data_list = self.load_data_list(img_dir, sample_ratio, shuffle) | ||
|
||
        # create transform | ||
        if transform_pipeline is not None: | ||
            self.transforms = create_transforms(transform_pipeline) # , global_config=global_config) | ||
        else: | ||
            raise ValueError('No transform pipeline is specified!') | ||
|
||
        # prefetch the data keys, to fit GeneratorDataset | ||
        _data = self.data_list[0] | ||
        _data = run_transforms(_data, transforms=self.transforms) | ||
        _available_keys = list(_data.keys()) | ||
        if output_columns is None: | ||
            self.output_columns = _available_keys | ||
        else: | ||
            self.output_columns = [] | ||
            for k in output_columns: | ||
                if k in _data: | ||
                    self.output_columns.append(k) | ||
                else: | ||
                    raise ValueError(f"Key '{k}' does not exist in data (available keys: {_data.keys()}). " | ||
                                     "Please check the name or the completeness transformation pipeline.") | ||
|
||
    def __getitem__(self, index): | ||
        data = self.data_list[index] | ||
|
||
        # perform transformation on data | ||
        data = run_transforms(data, transforms=self.transforms) | ||
        output_tuple = tuple(data[k] for k in self.output_columns) | ||
|
||
        return output_tuple | ||
|
||
    def load_data_list(self, | ||
                       img_dir: str, | ||
                       sample_ratio: List[float], | ||
                       shuffle: bool = False, | ||
                       **kwargs) -> List[dict]: | ||
        # read image file name | ||
        img_filenames = os.listdir(img_dir) | ||
        if shuffle: | ||
            img_filenames = random.sample(img_filenames, round(len(img_filenames) * sample_ratio)) | ||
        else: | ||
            img_filenames = img_filenames[:round(len(img_filenames) * sample_ratio)] | ||
|
||
        img_paths = [{'img_path': os.path.join(img_dir, filename)} for filename in img_filenames] | ||
|
||
        return img_paths |
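The subset selection in `load_data_list` can be studied in isolation with the sketch below. `select_files` is a hypothetical standalone function, not part of the PR, and it assumes `sample_ratio` is a single float. It differs from the diff in one labeled way: `os.listdir` returns names in arbitrary order, so the non-shuffled branch here sorts before slicing to make the kept prefix reproducible.

```python
import random
from typing import List


def select_files(filenames: List[str], sample_ratio: float = 1.0, shuffle: bool = False) -> List[str]:
    """Hypothetical sketch: keep a sample_ratio fraction of the file names."""
    n_keep = round(len(filenames) * sample_ratio)
    if shuffle:
        # Random subset without replacement, as in the diff.
        return random.sample(filenames, n_keep)
    # Deviation from the diff: sort first so the prefix does not depend on listing order.
    return sorted(filenames)[:n_keep]


names = ["000003.jpg", "000001.jpg", "000002.jpg", "000004.jpg"]
subset = select_files(names, sample_ratio=0.5)
random_subset = select_files(names, sample_ratio=0.5, shuffle=True)
```

With four names and `sample_ratio=0.5`, both branches keep two files; only the sorted branch is deterministic.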
Because of the separate `PredictDataset` class, there are many duplicates in the config file. It's easy to make a mistake when there is a lot of repetition.