Merge pull request PaddlePaddle#7284 from WenmuZhou/table_pr

update inference_en.md
Liyulingyue · Aug 22, 2022 · d41d046
2 parents aedeb28 + 6eca179

Showing 10 changed files with 65 additions and 49 deletions.
diff --git a/paddleocr.py b/paddleocr.py
@@ -636,4 +636,6 @@ def main():
 
             for item in result:
                 item.pop('img')
+                item.pop('res')
                 logger.info(item)
+            logger.info('result save to {}'.format(args.output))
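
The hunk above drops the bulky `img` and `res` entries from each result item before logging it. A minimal sketch of the same idea follows, assuming each item is a plain dict. The `summarize` helper, the logger name, and the sample data are hypothetical and not part of `paddleocr.py`. Unlike `item.pop('res')`, which raises `KeyError` when the key is missing and mutates the item, this version builds a filtered copy.

```python
import logging

logger = logging.getLogger("ppstructure_demo")  # hypothetical logger name


def summarize(result, drop_keys=("img", "res")):
    """Return a copy of each item dict without the keys listed in drop_keys."""
    return [
        {key: value for key, value in item.items() if key not in drop_keys}
        for item in result
    ]


# Made-up result in the shape the hunk implies: one dict per detected region.
result = [
    {"type": "table", "bbox": [0, 0, 10, 10], "img": object(), "res": {"html": "<table/>"}},
]

for item in summarize(result):
    logger.info(item)
```
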
diff --git a/ppstructure/README.md b/ppstructure/README.md
@@ -106,9 +106,9 @@ PP-Structure Series Model List (Updating)
 
 |model name|description|model size|download|
 | --- | --- | --- | --- |
-|ch_PP-OCRv3_det_slim|[New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection| 1.1M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_distill_train.tar)|
-|ch_PP-OCRv3_rec_slim |[New] Slim quantization with distillation lightweight model, supporting Chinese, English text recognition| 4.9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_train.tar) |
-|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model trained on PubTabNet dataset based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
+|ch_PP-OCRv3_det| [New] Lightweight model, supporting Chinese, English, multilingual text detection | 3.8M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar)|
+|ch_PP-OCRv3_rec| [New] Lightweight model, supporting Chinese, English, multilingual text recognition | 12.4M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
+|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
 
 ### 7.3 KIE model
 

diff --git a/ppstructure/README_ch.md b/ppstructure/README_ch.md
@@ -120,9 +120,9 @@ PP-Structure Series Model List (Updating)
 
 |model name|description|model size|download|
 | --- | --- | --- | --- |
-|ch_PP-OCRv3_det_slim|[New] Slim quantization + distillation ultra-lightweight model, supporting Chinese, English, and multilingual text detection| 1.1M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_distill_train.tar)|
-|ch_PP-OCRv3_rec_slim |[New] Slim quantization ultra-lightweight model, supporting Chinese, English, and digit recognition| 4.9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_train.tar) |
-|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model trained on PubTabNet dataset based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
+|ch_PP-OCRv3_det| [New] Ultra-lightweight model, supporting Chinese, English, and multilingual text detection | 3.8M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar)|
+|ch_PP-OCRv3_rec|[New] Ultra-lightweight model, supporting Chinese, English, and digit recognition|12.4M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
+|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
 
 
 <a name="73"></a>

diff --git a/ppstructure/docs/inference.md b/ppstructure/docs/inference.md
@@ -16,23 +16,26 @@ cd ppstructure
 Download the models
 ```bash
 mkdir inference && cd inference
-# Download the PP-OCRv2 text detection model and unzip it
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar && tar xf ch_PP-OCRv2_det_slim_quant_infer.tar
-# Download the PP-OCRv2 text recognition model and unzip it
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar && tar xf ch_PP-OCRv2_rec_slim_quant_infer.tar
-# Download the ultra-lightweight English table prediction model and unzip it
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
+# Download the PP-Structurev2 layout analysis model and unzip it
+wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
+# Download the PP-OCRv3 text detection model and unzip it
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
+# Download the PP-OCRv3 text recognition model and unzip it
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
+# Download the PP-Structurev2 table recognition model and unzip it
+wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
 cd ..
 ```
 <a name="1.1"></a>
 ### 1.1 layout analysis + table recognition
 ```bash
-python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_infer \
-                          --rec_model_dir=inference/ch_PP-OCRv2_rec_slim_quant_infer \
-                          --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer \
+python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
+                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
+                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
+                          --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                           --image_dir=./docs/table/1.png \
                           --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
-                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
+                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                           --output=../output \
                           --vis_font_path=../doc/fonts/simfang.ttf
 ```
@@ -41,19 +44,23 @@ python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_i
 <a name="1.2"></a>
 ### 1.2 layout analysis
 ```bash
-python3 predict_system.py --image_dir=./docs/table/1.png --table=false --ocr=false --output=../output/
+python3 predict_system.py --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
+                          --image_dir=./docs/table/1.png \
+                          --output=../output \
+                          --table=false \
+                          --ocr=false
 ```
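 After the run completes, each image has a directory with the same name inside the `structure` directory under the directory specified by the `output` field. Image regions are cropped and saved, and each file is named after the region's coordinates in the image. Layout analysis results are stored in the `res.txt` file.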
 
 <a name="1.3"></a>
 ### 1.3 table recognition
 ```bash
-python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_infer \
-                          --rec_model_dir=inference/ch_PP-OCRv2_rec_slim_quant_infer \
-                          --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer \
+python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
+                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
+                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                           --image_dir=./docs/table/table.jpg \
                           --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
-                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
+                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                           --output=../output \
                           --vis_font_path=../doc/fonts/simfang.ttf \
                           --layout=false
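
The download step and the `--*_model_dir` flags in this file have to agree: each `wget … && tar xf …` line is followed by a flag such as `--det_model_dir=inference/ch_PP-OCRv3_det_infer`. A minimal sketch of that naming relationship follows, assuming each archive unpacks into a directory named after the tarball, which is what the paths in the commands imply. The `extracted_dir` helper is hypothetical and performs no download.

```python
import posixpath
from urllib.parse import urlparse


def extracted_dir(url, root="inference"):
    """Map a model tarball URL to the directory the commands expect after `tar xf`."""
    name = posixpath.basename(urlparse(url).path)
    if not name.endswith(".tar"):
        raise ValueError(f"expected a .tar archive, got {name!r}")
    # Assumption: the archive contains a single top-level directory
    # with the same name as the tarball minus its extension.
    return posixpath.join(root, name[: -len(".tar")])


det_url = "https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar"
print(f"--det_model_dir={extracted_dir(det_url)}")
```
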

diff --git a/ppstructure/docs/inference_en.md b/ppstructure/docs/inference_en.md
@@ -18,23 +18,26 @@ download model
 
 ```bash
 mkdir inference && cd inference
-# Download the PP-OCRv2 text detection model and unzip it
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar && tar xf ch_PP-OCRv2_det_slim_quant_infer.tar
-# Download the PP-OCRv2 text recognition model and unzip it
-wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar && tar xf ch_PP-OCRv2_rec_slim_quant_infer.tar
-# Download the ultra-lightweight English table structure model and unzip it
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
+# Download the PP-Structurev2 layout analysis model and unzip it
+wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
+# Download the PP-OCRv3 text detection model and unzip it
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
+# Download the PP-OCRv3 text recognition model and unzip it
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
+# Download the PP-Structurev2 table recognition model and unzip it
+wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
 cd ..
 ```
 <a name="1.1"></a>
 ### 1.1 layout analysis + table recognition
 ```bash
-python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_infer \
-                          --rec_model_dir=inference/ch_PP-OCRv2_rec_slim_quant_infer \
-                          --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer \
+python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
+                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
+                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
+                          --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                           --image_dir=./docs/table/1.png \
                           --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
-                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
+                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                           --output=../output \
                           --vis_font_path=../doc/fonts/simfang.ttf
 ```
@@ -43,19 +46,23 @@ After the operation is completed, each image will have a directory with the same
 <a name="1.2"></a>
 ### 1.2 layout analysis
 ```bash
-python3 predict_system.py --image_dir=./docs/table/1.png --table=false --ocr=false --output=../output/
+python3 predict_system.py --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
+                          --image_dir=./docs/table/1.png \
+                          --output=../output \
+                          --table=false \
+                          --ocr=false
 ```
 After the run completes, each image has a directory with the same name inside the `structure` directory under the directory specified by the `output` field. Each picture region in the image is cropped and saved, and each file is named after the region's coordinates in the image. Layout analysis results are stored in the `res.txt` file.
 
 <a name="1.3"></a>
 ### 1.3 table recognition
 ```bash
-python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv2_det_slim_quant_infer \
-                          --rec_model_dir=inference/ch_PP-OCRv2_rec_slim_quant_infer \
-                          --table_model_dir=inference/en_ppocr_mobile_v2.0_table_structure_infer \
+python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
+                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
+                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                           --image_dir=./docs/table/table.jpg \
                           --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
-                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
+                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                           --output=../output \
                           --vis_font_path=../doc/fonts/simfang.ttf \
                           --layout=false
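
The three updated commands differ only in which model directories they pass and which stages they switch off. A minimal sketch that assembles the argument list for each case follows. All flags and paths are copied from the commands above, while the `build_command` helper and its mode names are hypothetical and not part of `predict_system.py`.

```python
OCR_AND_TABLE_ARGS = [
    "--det_model_dir=inference/ch_PP-OCRv3_det_infer",
    "--rec_model_dir=inference/ch_PP-OCRv3_rec_infer",
    "--table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer",
    "--rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt",
    "--table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt",
    "--vis_font_path=../doc/fonts/simfang.ttf",
]
LAYOUT_ARG = "--layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer"


def build_command(mode, image_dir, output="../output"):
    """Build the predict_system.py argument list for one of three hypothetical modes."""
    if mode not in ("layout+table", "layout", "table"):
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python3", "predict_system.py", f"--image_dir={image_dir}", f"--output={output}"]
    if mode != "table":
        cmd.append(LAYOUT_ARG)  # layout model is needed unless layout is disabled
    if mode != "layout":
        cmd.extend(OCR_AND_TABLE_ARGS)  # detection, recognition, and table models
    if mode == "layout":
        cmd.extend(["--table=false", "--ocr=false"])
    elif mode == "table":
        cmd.append("--layout=false")
    return cmd


print(" ".join(build_command("layout", "./docs/table/1.png")))
```

The returned list can be passed to `subprocess.run` from the `ppstructure` directory, since the relative paths in the commands assume that working directory.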

diff --git a/ppstructure/docs/models_list.md b/ppstructure/docs/models_list.md
@@ -24,8 +24,8 @@
 
 |model name|description|inference model size|download|
 | --- | --- | --- | --- |
-|en_ppocr_mobile_v2.0_table_det|Text detection model for English table scenarios trained on the PubLayNet dataset|4.7M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_det_train.tar) |
-|en_ppocr_mobile_v2.0_table_rec|Text recognition model for English table scenarios trained on the PubLayNet dataset|6.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_rec_train.tar) |
+|en_ppocr_mobile_v2.0_table_det|Text detection model for English table scenarios trained on the PubTabNet dataset|4.7M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_det_train.tar) |
+|en_ppocr_mobile_v2.0_table_rec|Text recognition model for English table scenarios trained on the PubTabNet dataset|6.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_rec_train.tar) |
 
 If you need to use other OCR models, you can download one from the [PP-OCR model_list](../../doc/doc_ch/models_list.md) or use your own trained model, and set it in the `det_model_dir` and `rec_model_dir` fields.
 
@@ -36,7 +36,7 @@
 | --- | --- | --- | --- |
 |en_ppocr_mobile_v2.0_table_structure|English table recognition model trained on PubTabNet dataset based on TableRec-RARE|6.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar) |
 |en_ppstructure_mobile_v2.0_SLANet|English table recognition model trained on PubTabNet dataset based on SLANet|9.2M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_train.tar) |
-|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model trained on PubTabNet dataset based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
+|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
 
 <a name="3"></a>
 

diff --git a/ppstructure/docs/models_list_en.md b/ppstructure/docs/models_list_en.md
@@ -36,7 +36,7 @@ If you need to use other OCR models, you can download the model in [PP-OCR model
 | --- |-----------------------------------------------------------------------------| --- | --- |
 |en_ppocr_mobile_v2.0_table_structure| English table recognition model trained on PubTabNet dataset based on TableRec-RARE |6.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar) |
 |en_ppstructure_mobile_v2.0_SLANet|English table recognition model trained on PubTabNet dataset based on SLANet|9.2M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_train.tar) |
-|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model trained on PubTabNet dataset based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
+|ch_ppstructure_mobile_v2.0_SLANet|Chinese table recognition model based on SLANet|9.3M|[inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_train.tar) |
 
 <a name="3"></a>
 ## 3. KIE

diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
@@ -59,16 +59,16 @@ cd PaddleOCR/ppstructure
 # download model
 mkdir inference && cd inference
 # Download the PP-OCRv3 text detection model and unzip it
-wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.tar && tar xf ch_PP-OCRv3_det_slim_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
 # Download the PP-OCRv3 text recognition model and unzip it
-wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar && tar xf ch_PP-OCRv3_rec_slim_infer.tar
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
 # Download the PP-Structurev2 table recognition model and unzip it
 wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
 cd ..
 # run
 python3.7 table/predict_table.py \
-    --det_model_dir=inference/ch_PP-OCRv3_det_slim_infer \
-    --rec_model_dir=inference/ch_PP-OCRv3_rec_slim_infer  \
+    --det_model_dir=inference/ch_PP-OCRv3_det_infer \
+    --rec_model_dir=inference/ch_PP-OCRv3_rec_infer  \
     --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
     --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
     --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \