update doc
an1018 committed Aug 23, 2022
1 parent d5d78b4 commit 9c424ff
Showing 13 changed files with 192 additions and 124 deletions.
3 changes: 2 additions & 1 deletion __init__.py
@@ -16,5 +16,6 @@
__version__ = paddleocr.VERSION
__all__ = [
'PaddleOCR', 'PPStructure', 'draw_ocr', 'draw_structure_result',
'save_structure_res', 'download_with_progressbar'
'save_structure_res', 'download_with_progressbar', 'sorted_layout_boxes',
'convert_info_docx'
]
6 changes: 3 additions & 3 deletions deploy/hubserving/readme.md
@@ -20,16 +20,16 @@ PaddleOCR provides 2 service deployment methods:

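# Service deployment based on PaddleHub Serving

The hubserving service deployment directory includes seven service packages: text detection, text direction classification, text recognition, the three-stage pipeline of text detection + text direction classification + text recognition, table recognition, PP-Structure, and layout analysis. Select the service package you need, then install and start it. The directory structure is as follows:
The hubserving service deployment directory includes seven service packages: text detection, text direction classification, text recognition, the three-stage pipeline of text detection + text direction classification + text recognition, layout analysis, table recognition, and PP-Structure. Select the service package you need, then install and start it. The directory structure is as follows: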
```
deploy/hubserving/
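└─ ocr_cls text direction classification module service package
└─ ocr_det text detection module service package
└─ ocr_rec text recognition module service package
└─ ocr_system text detection + text direction classification + text recognition pipeline service package
└─ structure_layout layout analysis service package
└─ structure_table table recognition service package
└─ structure_system PP-Structure service package
└─ structure_layout layout analysis service package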
```

Each service package contains 3 files. Taking the 2-stage pipeline service package as an example, the directory is as follows:
@@ -42,9 +42,9 @@ deploy/hubserving/ocr_system/
```
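## 1. Recent updates

* 2022.08.23 Added the layout analysis service.
* 2022.05.05 Added PP-OCRv3 detection and recognition models.
* 2022.03.30 Added two services: PP-Structure and table recognition.
* 2022.08.23 Added the layout analysis service.

## 2. Quick start
The following steps use the 2-stage detection + recognition pipeline service as an example. If you only need the detection service or the recognition service, substitute the corresponding file paths.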
4 changes: 2 additions & 2 deletions deploy/hubserving/readme_en.md
@@ -20,16 +20,16 @@ PaddleOCR provides 2 service deployment methods:

# Service deployment based on PaddleHub Serving

The hubserving service deployment directory includes seven service packages: text detection, text angle class, text recognition, text detection+text angle class+text recognition three-stage series connection, table recognition, PP-Structure and layout analysis. Please select the corresponding service package to install and start service according to your needs. The directory is as follows:
The hubserving service deployment directory includes seven service packages: text detection, text angle class, text recognition, text detection+text angle class+text recognition three-stage series connection, layout analysis, table recognition and PP-Structure. Please select the corresponding service package to install and start service according to your needs. The directory is as follows:
```
deploy/hubserving/
└─ ocr_det text detection module service package
└─ ocr_cls text angle class module service package
└─ ocr_rec text recognition module service package
└─ ocr_system text detection+text angle class+text recognition three-stage series connection service package
└─ structure_layout layout analysis service package
└─ structure_table table recognition service package
└─ structure_system PP-Structure service package
└─ structure_layout layout analysis service package
```

Each service package contains 3 files. Taking the 2-stage series connection service package as an example, the directory is as follows:
64 changes: 47 additions & 17 deletions paddleocr.py
@@ -562,7 +562,7 @@ def __init__(self, **kwargs):
params.table_model_dir,
os.path.join(BASE_DIR, 'whl', 'table'), table_model_config['url'])
layout_model_config = get_model_config(
'STRUCTURE', params.structure_version, 'layout', 'ch')
'STRUCTURE', params.structure_version, 'layout', lang)
params.layout_model_dir, layout_url = confirm_model_dir_url(
params.layout_model_dir,
os.path.join(BASE_DIR, 'whl', 'layout'), layout_model_config['url'])
@@ -584,7 +584,7 @@ def __init__(self, **kwargs):
logger.debug(params)
super().__init__(params)

def __call__(self, img, return_ocr_result_in_table=False):
def __call__(self, img, return_ocr_result_in_table=False, img_idx=0):
if isinstance(img, str):
# download net image
if img.startswith('http'):
@@ -602,7 +602,8 @@ def __call__(self, img, return_ocr_result_in_table=False):
if isinstance(img, np.ndarray) and len(img.shape) == 2:
img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)

res, _ = super().__call__(img, return_ocr_result_in_table)
res, _ = super().__call__(
img, return_ocr_result_in_table, img_idx=img_idx)
return res
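
The `__call__` change above threads a page index through to the parent class while keeping a default of `0`, so existing single-image callers are unaffected. A minimal sketch of that forwarding pattern follows; `BaseEngine` and `Engine` are hypothetical stand-ins written for illustration, not the PaddleOCR classes, and the returned dictionaries are invented.

```python
class BaseEngine:
    """Hypothetical stand-in for the parent structure system."""

    def __call__(self, img, return_ocr_result_in_table=False, img_idx=0):
        # Return a (result, timing) pair, mirroring the tuple unpacked in the diff.
        result = [{"img_idx": img_idx, "type": "text", "input": img}]
        return result, {"elapsed": 0.0}


class Engine(BaseEngine):
    """Hypothetical subclass that forwards img_idx the way the diff does."""

    def __call__(self, img, return_ocr_result_in_table=False, img_idx=0):
        # Forward img_idx by keyword; the default of 0 keeps old callers working.
        res, _ = super().__call__(
            img, return_ocr_result_in_table, img_idx=img_idx)
        return res


engine = Engine()
first_page = engine("page_0.jpg")             # old call style, img_idx defaults to 0
third_page = engine("page_2.jpg", img_idx=2)  # new call style for multi-page input
```

Passing the index by keyword rather than position leaves room for the parent signature to grow without breaking this call.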


@@ -637,25 +638,54 @@ def main():
for line in result:
logger.info(line)
elif args.type == 'structure':
result = engine(img_path)
save_structure_res(result, args.output, img_name)
img, flag_gif, flag_pdf = check_and_read(img_path)
if not flag_gif and not flag_pdf:
img = cv2.imread(img_path)

if args.recovery:
try:
from ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx
img = cv2.imread(img_path)
if not flag_pdf:
if img is None:
logger.error("error in loading image:{}".format(img_path))
continue
img_paths = [[img_path, img]]
else:
img_paths = []
for index, pdf_img in enumerate(img):
os.makedirs(
os.path.join(args.output, img_name), exist_ok=True)
pdf_img_path = os.path.join(args.output, img_name, img_name
+ '_' + str(index) + '.jpg')
cv2.imwrite(pdf_img_path, pdf_img)
img_paths.append([pdf_img_path, pdf_img])

all_res = []
for index, (new_img_path, img) in enumerate(img_paths):
logger.info('processing {}/{} page:'.format(index + 1,
len(img_paths)))
new_img_name = os.path.basename(new_img_path).split('.')[0]
result = engine(new_img_path, img_idx=index)
save_structure_res(result, args.output, img_name, index)

if args.recovery and result != []:
from copy import deepcopy
from ppstructure.recovery.recovery_to_doc import sorted_layout_boxes
h, w, _ = img.shape
res = sorted_layout_boxes(result, w)
convert_info_docx(img, res, args.output, img_name,
result_cp = deepcopy(result)
result_sorted = sorted_layout_boxes(result_cp, w)
all_res += result_sorted

for item in result:
item.pop('img')
item.pop('res')
logger.info(item)
logger.info('result save to {}'.format(args.output))

if args.recovery and all_res != []:
try:
from ppstructure.recovery.recovery_to_doc import convert_info_docx
convert_info_docx(img, all_res, args.output, img_name,
args.save_pdf)
except Exception as ex:
logger.error(
"error in layout recovery image:{}, err msg: {}".format(
img_name, ex))
continue

for item in result:
item.pop('img')
item.pop('res')
logger.info(item)
logger.info('result save to {}'.format(args.output))
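
The new `main()` loop writes each PDF page to `<output>/<img_name>/<img_name>_<index>.jpg`, runs the engine per page, and accumulates a deep copy of each page's result before the `img` and `res` keys are popped for logging. The sketch below illustrates that flow under stated assumptions: `fake_engine` and `collect_pages` are hypothetical helpers, the result dictionaries are invented, and the real code additionally sorts the copied boxes with `sorted_layout_boxes`, which is omitted here.

```python
import os
from copy import deepcopy


def page_image_path(output_dir, img_name, index):
    """Build the per-page JPEG path the same way the loop in the diff does."""
    return os.path.join(output_dir, img_name,
                        img_name + '_' + str(index) + '.jpg')


def fake_engine(path, img_idx=0):
    # Hypothetical stand-in for the structure engine: one region per page.
    return [{'type': 'text', 'bbox': [0, 0, 10, 10], 'img': b'pixels',
             'res': [{'text': os.path.basename(path)}], 'img_idx': img_idx}]


def collect_pages(output_dir, img_name, num_pages):
    """Run the stand-in engine on every page and aggregate the results."""
    all_res = []
    for index in range(num_pages):
        path = page_image_path(output_dir, img_name, index)
        result = fake_engine(path, img_idx=index)
        # Copy before aggregating: the pops below would otherwise strip
        # 'img' and 'res' from the dicts already stored in all_res.
        all_res += deepcopy(result)
        for item in result:
            item.pop('img')
            item.pop('res')
    return all_res


pages = collect_pages('output', 'UnrealText', 3)
```

Without the copy, the aggregated list would share dictionaries with `result`, and the document-recovery step would receive entries that no longer carry their recognized content.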
2 changes: 2 additions & 0 deletions ppstructure/docs/quickstart.md
@@ -102,6 +102,8 @@ paddleocr --image_dir=ppstructure/docs/table/table.jpg --type=structure --layout
paddleocr --image_dir=ppstructure/docs/table/1.png --type=structure --recovery=true
# English test image
paddleocr --image_dir=ppstructure/docs/table/1.png --type=structure --recovery=true --lang='en'
# PDF test file
paddleocr --image_dir=ppstructure/docs/recovery/UnrealText.pdf --type=structure --recovery=true --lang='en'
```

<a name="22"></a>
2 changes: 2 additions & 0 deletions ppstructure/docs/quickstart_en.md
@@ -85,6 +85,8 @@ Please refer to: [Key Information Extraction](../kie/README.md) .
paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true
# English image
paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true --lang='en'
# PDF file
paddleocr --image_dir=ppstructure/docs/recovery/UnrealText.pdf --type=structure --recovery=true --lang='en'
```

<a name="22"></a>
Expand Down
Binary file added ppstructure/docs/recovery/UnrealText.pdf
Binary file not shown.
Binary file added ppstructure/docs/recovery/recovery_ch.jpg
74 changes: 37 additions & 37 deletions ppstructure/layout/README_ch.md
@@ -160,11 +160,13 @@ The json file contains the annotations of all images, with the data stored as nested dictionaries,
```
mkdir pretrained_model
cd pretrained_model
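# Download the PubLayNet pretrained model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout.pdparams
# Download the PubLayNet pretrained model (to try model evaluation, prediction, and dynamic-to-static export directly)
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams
# Download the PubLayNet inference model (to try model inference directly)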
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar
```

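Download more [layout analysis models](../docs/models_list.md) (pretrained model for the Chinese CDLA dataset, pretrained model for tables)
If the test images are in Chinese, you can download the model pretrained on the Chinese CDLA dataset, which recognizes 10 types of document regions: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation. Download the training and inference models for `picodet_lcnet_x1_0_fgd_layout_cdla` from [layout analysis models](../docs/models_list.md). If you only need to detect table regions in an image, you can use the model pretrained on the table dataset: download the training and inference models for `picodet_lcnet_x1_0_fgd_layout_table` from [layout analysis models](../docs/models_list.md).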

### 4.1. Start training

@@ -216,14 +218,14 @@ TestDataset:
# Single-GPU training
export CUDA_VISIBLE_DEVICES=0
python3 tools/train.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--eval
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--eval

# Multi-GPU training; specify the GPU IDs with the --gpus argument
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--eval
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--eval
```

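**Note:** If you run out of GPU memory during training, reduce batch_size in TrainReader and reduce base_lr in LearningRate by the same factor. The released configs were all trained on 8 GPUs; if you change the number of GPUs to 1, base_lr must be reduced by a factor of 8.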
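
The scaling in the note above is simple proportional arithmetic. The helper below is a small sketch of it; `scaled_base_lr` is a hypothetical name, and the starting value `0.4` is an illustrative placeholder rather than a value taken from the released configs.

```python
def scaled_base_lr(base_lr, num_gpus, reference_gpus=8):
    """Scale base_lr linearly with GPU count relative to the 8-GPU reference."""
    if num_gpus <= 0 or reference_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return base_lr * num_gpus / reference_gpus


# Placeholder learning rate for illustration only; read the real value
# from the LearningRate section of the config you are using.
single_gpu_lr = scaled_base_lr(0.4, num_gpus=1)  # one eighth of the 8-GPU value
four_gpu_lr = scaled_base_lr(0.4, num_gpus=4)    # half of the 8-GPU value
```

The same ratio applies when only batch_size changes: halving batch_size per the note means halving base_lr as well.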
@@ -252,9 +254,9 @@ PaddleDetection supports FGD-based ([Focal and Global Knowledge Distillation for D
# Single-GPU training
export CUDA_VISIBLE_DEVICES=0
python3 tools/train.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
--eval
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
--eval
```

- `-c`: specifies the model config file.
@@ -269,8 +271,8 @@ python3 tools/train.py \
```bash
# GPU evaluation; weights is the model to evaluate
python3 tools/eval.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
-o weights=./output/picodet_lcnet_x1_0_layout/best_model
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
-o weights=./output/picodet_lcnet_x1_0_layout/best_model
```

The following is printed, including mAP, AP0.5, and other metrics.
@@ -292,13 +294,13 @@ python3 tools/eval.py \
[08/15 07:07:09] ppdet.engine INFO: Best test bbox ap is 0.935.
```

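Evaluate with the FGD-distilled model:
To evaluate with **the provided pretrained model** or with **a model trained with FGD distillation**, change the `weights` model path and run the following command: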

```
python3 tools/eval.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
-o weights=output/picodet_lcnet_x2_5_layout/best_model
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
-o weights=output/picodet_lcnet_x2_5_layout/best_model
```

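- `-c`: specifies the model config file.
@@ -325,18 +327,16 @@ python3 tools/infer.py \
- `--output_dir`: specifies the directory where visualization results are saved.
- `--draw_threshold`: specifies the NMS threshold for drawing result boxes.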

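The predicted image is shown below; it is saved to the `output_dir` path.

Test with the FGD-distilled model:
To predict with **the provided pretrained model** or with **a model trained with FGD distillation**, change the `weights` model path and run the following command: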

```
python3 tools/infer.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
-o weights='output/picodet_lcnet_x2_5_layout/best_model.pdparams' \
--infer_img='docs/images/layout.jpg' \
--output_dir=output_dir/ \
--draw_threshold=0.5
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
-o weights='output/picodet_lcnet_x2_5_layout/best_model.pdparams' \
--infer_img='docs/images/layout.jpg' \
--output_dir=output_dir/ \
--draw_threshold=0.5
```


@@ -351,9 +351,9 @@ Inference models (saved with `paddle.jit.save`) are generally, after model training,

```bash
python3 tools/export_model.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
-o weights=output/picodet_lcnet_x1_0_layout/best_model \
--output_dir=output_inference/
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
-o weights=output/picodet_lcnet_x1_0_layout/best_model \
--output_dir=output_inference/
```

* If you do not need to export post-processing, specify `-o export.benchmark=True` (if -o has already appeared, drop the -o here)
@@ -368,27 +368,27 @@ output_inference/picodet_lcnet_x1_0_layout/
└── model.pdmodel # model structure file of the inference model
```

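To convert the FGD-distilled model to an inference model:
To convert **the provided pretrained model** or **a model trained with FGD distillation** to an inference model, change the `weights` model path and proceed as follows: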

```bash
python3 tools/export_model.py \
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
-o weights=./output/picodet_lcnet_x2_5_layout/best_model \
--output_dir=output_inference/
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
-o weights=./output/picodet_lcnet_x2_5_layout/best_model \
--output_dir=output_inference/
```



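### 6.2 Model inference

For the layout recovery task, run the following command for inference:
To run inference with **the provided inference model** or with **a model trained with FGD distillation**, change the `model_dir` inference model path and run the following command: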

```bash
python3 deploy/python/infer.py \
--model_dir=output_inference/picodet_lcnet_x1_0_layout/ \
--image_file=docs/images/layout.jpg \
--device=CPU
--model_dir=output_inference/picodet_lcnet_x1_0_layout/ \
--image_file=docs/images/layout.jpg \
--device=CPU
```

- `--device`: specifies the device, GPU or CPU
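
When the inference command above is launched from a script, assembling it as an argument list avoids shell-quoting mistakes. The sketch below uses only the script path and the three flags shown in the command; `build_infer_command` is a hypothetical helper, and restricting `device` to `CPU` or `GPU` is an assumption based on the bullet above.

```python
import shlex


def build_infer_command(model_dir, image_file, device="CPU"):
    """Assemble the deploy/python/infer.py invocation as an argv list."""
    # Assumption: only the two devices named in the bullet are accepted.
    if device not in ("CPU", "GPU"):
        raise ValueError("device must be 'CPU' or 'GPU'")
    return [
        "python3", "deploy/python/infer.py",
        "--model_dir=" + model_dir,
        "--image_file=" + image_file,
        "--device=" + device,
    ]


cmd = build_infer_command(
    "output_inference/picodet_lcnet_x1_0_layout/",
    "docs/images/layout.jpg")
command_line = shlex.join(cmd)  # printable form for logs
```

The list can be handed to `subprocess.run(cmd, check=True)` from the PaddleDetection root directory; the joined string is meant for logging only.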